IDEAL 7, 1994



ESL STUDENTS AND COMMON L2 CONVERSATION-MAKING EXPRESSIONS*



Eli Hinkel 



Ritualized American conversation-making expressions have the purpose 
of facilitating social interactions. However, pragmatic failure can occur if NNSs
(non-native speakers) misinterpret the pragmatic force of these conversational 
expressions. This study focuses on the differences in NS (native speaker) and 
NNS interpretations of the pragmatic force in twenty common conversation- 
making expressions (Coulmas, 1979), such as It's really cold today and How 
long have you been in the U.S.? In a survey based on rankings of 
multiple-choice selections, recognition of the pragmatic force in ritualized 
expressions by highly-trained NNSs was unsystematic compared to that by NSs. 

The NSs' and NNSs' responses did not exhibit strong correlation for any of the 
given conversational expressions. Moreover, a proportion of NNSs interpreted 
the pragmatic force of some of these expressions as that of a face-threatening act. 

These findings imply that NNSs need to be taught conversation-making
expressions and the forms that interactional exchanges can take.

INTRODUCTION 

ESL teachers and American students frequently note that international students often 
remain outside observers and do not take an active part in L2 peer-group social interactions.
NNSs may not participate in conversational exchanges for a variety of reasons, but one 
contributing factor could be that NNSs do not always understand the pragmatic force in common 
and ritualized American conversation-making expressions. 

Conversation-making expressions, such as What's up?, How are you doing?, and It's
really cold/hot today, are frequently used in casual social interactions that take place daily in
offices, college classes, and campus lounges and restaurants, i.e., the locales where students
interact with one another. Such expressions, routinely used in brief social encounters, are
conversational moves (statements or questions) that serve as a means of initiating and maintaining
conversations. They represent "a mark of friendship or interest" or "rapport-inspiring activity"
(Brown & Levinson, 1987, p. 117) and have the purpose of facilitating social interactions (Bach
& Harnish, 1979, p. 62). Conversation-making expressions are rarely expected to be profound
and usually involve weather, an object or person in the immediate environment, or something
that the speaker believes he or she has in common with the hearer that can help promote
"conversational cooperativeness" and "get a conversation going" (Wardhaugh, 1985, p. 22). A
typical exchange can run along the lines of

(1) A: How are you this morning? 

B: Fine. And you? 

A: Fine. Did you hear they predicted snow for this afternoon?

However, the use of ritualized conversational expressions necessitates the participants'
ability to assign them a pragmatic force and recognize their communicative purpose (Leech,
1983). For example, What did you do over the weekend? is probably an attempt at
conversation-making and may require a reasonably detailed response; on the other hand, the
question How was your weekend? could be used as a greeting, depending on context, and may
not be intended to elicit a lengthy response. To complicate matters, Leech (1977) and Brown
& Levinson (1987) have established that pragmatic force is frequently ambivalent, and context
does not always remove the ambiguity.

Although a great deal of research has focused on various forms of the L2 speech act, 
relatively little has addressed learner interpretations of conversational discourse. Among the 
few, Devine (1982) and Bouton (1989) have investigated how NSs and NNSs interpreted 
conversational implicature. They found that NNSs frequently misunderstood the speaker's 
communicative purpose. Similarly, Bouton (1990) concluded that even NNSs with a relatively 
high proficiency have difficulty interpreting implications in conversations and hence should be 
taught the appropriate skills. 

A misinterpretation of the speaker's purpose can be particularly damaging and can result 
in pragmatic failure (Thomas, 1983) if the hearer interprets it as a threat to his or her face, i.e. 
"an expression of disapproval, criticism, contempt, or ridicule, ... [and] insult" when the hearer 
perceives that the speaker does not like "one or more [of the hearer's] acts, personal 
characteristics, ... beliefs or values" (Brown & Levinson, 1987, p. 66). For example, an 
advanced Chinese student responded "Why do you ask me? Did I do something wrong?" to the
conversational expression "How is everything going?" and had to be reassured that the
expression had no other goal than to show friendly interest. 

The purpose of this study is to investigate whether advanced and highly-trained NNSs 
recognize the pragmatic force of common English conversation-making expressions and 
understand them in ways NSs do. Specifically, the investigation focused on two issues: 

(1) whether the pragmatic force of common and ritualized conversation-making 
expressions as a means for initiating and/or maintaining conversations is consistently and 
similarly recognized by NSs and NNSs, and 

(2) if NSs or NNSs don't interpret the pragmatic force intended in such expressions as 
primarily that of initiating and/or maintaining conversations, how might they interpret the 
pragmatic force of such expressions. The second part of this study investigated the possibility 
that NNSs may perceive a threat to their face from the conversation-making expressions. 



METHOD 

Research in L2 learner perceptions of the appropriateness and purpose of communicative 
routines is limited to observing conversations (Wolfson, 1986; Takahashi & Beebe, 1987) or 
obtaining data through various questionnaire formats. While open-ended instruments and
"naturalistic" approaches to data gathering, such as audio- and video-taping, are frequently prone
to problems associated with interpreting subjects' responses and controlling for extralinguistic
variables, multiple-choice questionnaires, within limitations, have proven to be a more
effective measure of subjects' ability to interpret a speaker's conversational implicature (Bouton,
1989). Since the pragmatic force of an utterance is one form of conversational implicature 
(Thomas, 1983), the multiple-choice format was chosen as an appropriate instrument for 
identifying hearer perceptions of the pragmatic force in conversation-making expressions. 

Subjects 

All participants in the study were enrolled in the Ohio State University. Of the 99 ESL 
students, 48 were speakers of Chinese (CH), 20 each of Indonesian (IN) and Korean (KR), 6 
Arabic (AR), and 5 Spanish (SP). The NNSs represented a highly advanced group of language 
learners with a mean TOEFL score of 587. Their residence in the U.S. typically fell within the 
range of 1 to 4 years, with a mean of 1.7. It follows that all NNSs had had some exposure to 
their host culture and L2 conversational routines. 

In addition to the NNSs, a group of 24 NS undergraduate students, raised in Ohio, 
Indiana, and Kentucky, participated in the study. All NSs were taking a required composition 
course and were enrolled in various departments in the University. This study compares the 
pragmatic force that the NNS and NS students assigned to the questionnaire expressions. 

The Questionnaire 

To delineate the social distance in questionnaire situations, a peer acquaintance was 
briefly outlined: 

When you are responding to the questions, please keep in mind the following imaginary 
student: N.H. is a student in your department. You have similar interests in your majors. You 
have talked to N.H. several times in the department lounge. 

Following this description, 24 interactional situations were presented, in which the
speaker used ritualized conversation-making expressions routinely heard in everyday exchanges,
such as How long have you been in the U.S.? and I haven't seen you in a while (see Table 1 for
a complete list). The situations and the conversational expressions came from four ESL
speaking-skills texts (Sharpe, 1984; Tillitt & Bruder, 1985; Helgesen, Mandeville & Jordan,
1986; Tansey & Blatchford, 1987) which work with conversation-making rituals frequently used
in American English. Each item in the questionnaire adhered to the same format:
a situation was briefly described; following the situation, three multiple-choice selections were
presented. For example:

(2) [19] N.H. says to you: "Where do you live in Columbus?" By saying this, N.H.

(A) indicates that N.H. wants to visit you

(B) wants to compare apartments with you

(C) tries to make conversation

(3) [6] You are talking to N.H. in the library. N.H. says to you: "It's really cold today." By
saying this, N.H.

(A) shows that N.H. does not want to study

(B) indicates that the library is too far from the main campus

(C) tries to make conversation

One of the choices remained constant for all items and identified the statement or the
question presented in the situation as a conversation-making expression (tries to make
conversation)^ (see Appendix for the text of the questionnaire). According to the questionnaire
instructions, the subjects were to rank the multiple-choice item that best described the speaker's
communicative purpose as 1; the second best selection was ranked 2, and the least applicable
3. The first four situations were used as a "warm-up" and were omitted from the data analysis.

Table 1

Conversational Expressions in the Questionnaire

Not used in data analysis:

1. I have always wanted to travel abroad.
2. How have you been lately?
3. Is that a new coat?
4. Where are you from?

Used in data analysis:

5. What are you taking this quarter?
6. It's really cold today. [conversation in the library]
7. Are you going home for the break?
8. There are many foreign students on campus.
9. You should go to New York. New York is a lot of fun.
10. It's really cold today. [conversation in the street]
11. How long have you been in the U.S.?
12. How do you like Columbus?
13. Do you like it in America?
14. Did you visit your family during the vacation?
15. Two years ago, my friend had a roommate who was a foreign student.
16. Have you been to California?
17. Do you plan to go back to your country after you graduate?
18. Do you like American food?
19. Where do you live in Columbus?
20. It is probably difficult to study in a foreign country.
21. Do you have American friends?
22. OSU is a very large university.
23. I haven't seen you in a while.
24. In my economics class, we have a student from your country.

In situations #5-24, 10 included neutral or positive statements without a threat to the 
hearer's face (Brown & Levinson, 1987), as in (2-3). The other 10 contained one neutral or 
positive statement and one statement which presented a threat to the hearer's face, as in (4c) and 
(5c). 



(4) [8] N.H. says to you: "There are many foreign students on campus." By saying this, N.H. 

(A) indicates that OSU is an internationally famous university 

(B) tries to make conversation 

(C) indicates that there are too many foreign students on campus

(5) [24] N.H. says to you: "In my economics class, we have a student from your country." By
saying this, N.H.

(A) tries to make conversation 

(B) indicates that N.H. likes foreign students 

(C) indicates that N.H. dislikes foreign students 

In the situations and the accompanying multiple-choice selections, all references to gender, 
age, nationality, and native language were avoided; no pronouns, except the first person singular 
and second person singular were used. Passive voice and modal verbs, except for should, were 
excluded; the vocabulary utilized in the questionnaire was limited so as not to impede the task. 



RESULTS AND ANALYSIS 

The participants' responses were analyzed for each situation, 5 through 24. The rankings
(1 - 'best describes the speaker's purpose' through 3 - 'least applicable') given by the various
participants to the choice 'tries to make conversation' were summed and averaged. For example,
the average rankings given to this selection in situation #5, What are you taking this quarter?,
were 1.04 by NSs, 1.33 by Indonesians, 1.65 by Chinese, 1.85 by Spanish-speakers, 2.20 by
Koreans, and 2.79 by Arabic-speakers. The fact that the NS value is slightly greater than 1.00
indicates that not all NSs considered the primary pragmatic force of the question What are you
taking this quarter? to be conversation-making.
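
The averaging step can be illustrated with a minimal Python sketch. The response list below is hypothetical: the study reports only group means, and 23 rankings of 1 plus one ranking of 2 is a distribution of 24 NS responses that is consistent with the reported mean of 1.04. The function name mean_ranking is likewise an illustrative choice.

```python
def mean_ranking(rankings):
    """Average the rankings (1 = best, 3 = least applicable) given to one selection."""
    if not rankings:
        raise ValueError("at least one ranking is required")
    return sum(rankings) / len(rankings)


# Hypothetical responses from 24 NSs to "tries to make conversation" in situation #5:
# 23 participants rank it 1 and one participant ranks it 2.
ns_rankings = [1] * 23 + [2]

print(round(mean_ranking(ns_rankings), 2))
```

A single ranking other than 1 in a group of 24 is enough to move the mean above 1.00, which is why the NS value is read as near-unanimous rather than unanimous.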

Kendall's Coefficient of Concordance (W) was computed for average rankings given by 
all groups of participants across the conversational expressions #5-24 (Table 2). A W value of
1.00 indicates perfect concordance (or consistency) between the rankings; the closer this value 
is to 0, the more randomly the rankings are assigned. 

Table 2

Kendall's Coefficient of Concordance (W)

Rank Orderings of Conversational Expressions by NSs and NNSs

#    5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

NS   1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
CH   4  3  6  4  4  3  4  2  3  3  4  4  3  2  2  6  3  3  2  3
IN   5  2  3  5  6  4  5  6  4  4  6  5  4  4  4  5  5  2  3  5
KR   3  5  4  3  5  2  2  5  5  5  3  3  5  3  5  2  2  4  4  2
AR   6  6  5  6  2  5  6  4  6  2  5  6  2  6  3  3  6  5  5  6
SP   2  4  2  2  3  6  3  3  2  6  2  2  6  5  6  4  4  6  6  4

By NSs and NNSs combined: W = .50 (p < .01)
By NNSs only: W = .12 (p = .05)
By NSs only: W = .998 (p < .0001)

The value W = .50 (p < .01) indicates a marginal degree of concordance in the
subjects' rankings; the value W = .12 (p = .05) computed for the five NNS groups shows
only weak concordance and approaches a random state. Another Coefficient of Concordance
computed for rankings given by NSs approximated 1.00 (p < .001). The disparity between
W for exclusively NSs (W = .998) and all groups of NNSs (W = .12) suggests that the rankings
given by NNSs to the pragmatic force of expressions #5-24 were markedly different from
those given by NSs.
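
A minimal Python sketch of the concordance computation follows. It uses the standard no-ties formula W = 12S / (m^2(n^3 - n)), where m is the number of rankings, n the number of ranked objects, and S the sum of squared deviations of the rank totals from their mean. It assumes that each of the 20 expressions acts as one ranking of the six groups; this is one reading of the rank orderings in Table 2, not a procedure the paper spells out, and the function and variable names are illustrative.

```python
def kendalls_w(rank_rows):
    """Kendall's W for m complete rankings of the same n objects (no ties)."""
    m = len(rank_rows)
    n = len(rank_rows[0])
    # Total of the ranks each object received across all m rankings.
    totals = [sum(row[j] for row in rank_rows) for j in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Rank orderings from Table 2, one list per group, expressions #5-24 in order.
table2 = {
    "NS": [1] * 20,
    "CH": [4, 3, 6, 4, 4, 3, 4, 2, 3, 3, 4, 4, 3, 2, 2, 6, 3, 3, 2, 3],
    "IN": [5, 2, 3, 5, 6, 4, 5, 6, 4, 4, 6, 5, 4, 4, 4, 5, 5, 2, 3, 5],
    "KR": [3, 5, 4, 3, 5, 2, 2, 5, 5, 5, 3, 3, 5, 3, 5, 2, 2, 4, 4, 2],
    "AR": [6, 6, 5, 6, 2, 5, 6, 4, 6, 2, 5, 6, 2, 6, 3, 3, 6, 5, 5, 6],
    "SP": [2, 4, 2, 2, 3, 6, 3, 3, 2, 6, 2, 2, 6, 5, 6, 4, 4, 6, 6, 4],
}

# Transpose so that each row holds one expression's ordering of the six groups.
by_expression = [list(column) for column in zip(*table2.values())]
w_all = kendalls_w(by_expression)

print(round(w_all, 2))
```

Under this reading, the Table 2 ranks give a value of about .50, in line with the combined NS and NNS figure reported in the table. The NNS-only value would require re-ranking the five NNS groups among themselves, which the sketch does not attempt.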

To determine whether the rankings exhibited substantial correlations between groups, 
a Spearman’s Rank Correlation Coefficient matrix was computed (Table 3) using the 
previously discussed average rankings by groups. Excluding the cell entries along the 
diagonal, the values in only 4 cells equaled or exceeded the critical test value of .45 (p = 
.05) and exhibited significant, albeit marginal, correlations between the rankings of NSs and
Indonesians (.53), NSs and Koreans (.47), NSs and Arabs (.45), and the Chinese and 
Indonesians (.52). The other 11 cells in the matrix show low positive or negative 
correlations that are not significant. 
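
For readers who wish to see how such a coefficient is obtained, the following Python sketch computes Spearman's rank correlation as the Pearson correlation of rank-transformed values, with tied values sharing their average rank. The two lists of mean rankings are hypothetical, since the paper does not report the full set of group averages, and the function names are illustrative.

```python
def average_ranks(values):
    """Rank values from smallest to largest; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical mean rankings of "tries to make conversation" for five expressions.
group_a = [1.04, 1.10, 1.25, 1.08, 1.30]
group_b = [1.65, 1.40, 1.90, 1.55, 1.70]

print(round(spearman_rho(group_a, group_b), 2))
```

With 20 expressions per group, a coefficient computed in this way would be compared against the critical value of .45 cited above.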



Table 3

Spearman's Rank Correlation Coefficient Matrix

       NS      CH      IN      KR      AR

NS    1.00
CH    0.36    1.00
IN    0.53*   0.52*   1.00
KR    0.47*   0.22    0.14    1.00
AR    0.45*   0.42    0.30   -0.16    1.00
SP    0.40    0.19    0.37    0.26   -0.02

* significant at p = .05 (2-tailed test)

The data in Tables 2 and 3 indicate that overall, NNSs did not interpret the pragmatic
force of common conversation-making expressions the way that NSs did.

The second part of the study is concerned with how NNSs do perceive some of these
conversational devices. As has been mentioned, 10 items in the questionnaire contained a 
multiple-choice selection with an explicit threat to the hearer’s face, as in (6a): 

(6) [23] N.H. says to you: "I haven’t seen you in a while." By saying this, N.H. 

(A) indicates that N.H. thinks you are not a good friend 

(B) shows that N.H. wants to be your friend 

(C) wants to greet you and tries to make conversation 

In situations with threat to face, only one of the NSs (4%) in only one of the situations 
(see (17) in Table 4) ranked such a selection as best describing the pragmatic force of the 
conversation-making expressions #5-24 (Table 4). Almost all NSs ranked such items as the 
least likely interpretations of the pragmatic force of the speaker’s statements or questions. 
However, a proportion of the NNSs ranked them as best describing the pragmatic force in 
the situations (see Table 4). These rankings indicate that NNSs often interpreted some of 
the conversation-making expressions listed in #5-24 as disparagement.

Table 4

Perceiving Threat to Hearer's Face in Conversational Expressions

N=123     NSs     Chinese  Indonesian  Korean  Arabic  Spanish
#         (n=24)  (n=48)   (n=20)      (n=20)  (n=6)   (n=5)
          %       %        %           %       %       %

8. There are many foreign students on campus.
          0       19       20          5       50      20

9. You should go to New York. New York is a lot of fun.
          0       15       20          25      0       0

11. How long have you been in the U.S.?
          0       19       0           15      0       0

13. Do you like it in America?
          0       27       10          25      0       0

15. Two years ago, my friend had a roommate who was a foreign student.
          0       10       10          0       0       0

17. Do you plan to go back to your country after you graduate?
          4       29       15          25      67      20

20. It is probably difficult to study in a foreign country.
          0       13       20          15      50      20

21. Do you have American friends?
          0       13       10          15      0       0

23. I haven't seen you in a while.
          0       15       15          0       0       0

24. In my economics class, we have a student from your country.
          0       8        0           15      0       0
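
The percentages in Table 4 are proportions of each group, rounded to whole numbers. A minimal Python sketch of that arithmetic follows; the first call reflects the one NS in 24 reported in the text for #17, while the second count (four of six Arabic speakers) is a hypothetical figure chosen only because it is consistent with the 67% shown for the same item. The function name is illustrative.

```python
def percent_ranked_first(count_ranked_first, group_size):
    """Whole-number percentage of a group that ranked a given selection first."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return round(100 * count_ranked_first / group_size)


# One of the 24 NSs ranked the face-threatening selection first in situation #17.
print(percent_ranked_first(1, 24))

# Hypothetical count: four of the six Arabic speakers in the same situation.
print(percent_ranked_first(4, 6))
```

Because the Arabic and Spanish groups are small (n=6 and n=5), a single respondent shifts their percentages by 17 and 20 points respectively, which is worth keeping in mind when comparing columns.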



CONCLUSIONS 

The results of this study demonstrate that the NNS interpretations of the pragmatic 
force in common and ritualized conversation-making expressions were almost random, while 
NS interpretations were highly consistent. It seems clear that NNSs frequently 
misunderstand the pragmatic force of common American conversation-making expressions. 
The most disconcerting finding of this study is that NNSs occasionally interpreted the 
primary communicative purpose behind some conversation-making expressions as disapproval 
or disregard. While further investigation of NNSs’ understanding of conversation-making 
expressions is undoubtedly necessary, it seems evident that NNSs may need to be explicitly
taught conversational skills and the usage of social conversation-making expressions.

NOTES

*A draft of this paper was presented at the Sixth Annual Conference on Pragmatics and 
Language Learning at the University of Illinois at Urbana-Champaign. 

^The multiple-choice options for all 24 questions (except tries to make conversation) 
were suggested by 25 NNSs during the pilot stages of the study. Of the 25 NNSs, 14 were
speakers of Chinese, 5 Korean, 2 each Indonesian, Arabic, and Spanish. 



REFERENCES 

Bach, Kent & Harnish, Robert. (1979). Linguistic communication and speech acts.
Cambridge, MA: The MIT Press.

Bouton, Lawrence. (1989). So they got the message, but how did they get it? IDEAL, 4,
119-148.

Bouton, Lawrence. (1990). The effective use of implicature in English: Why and how it
should be taught in the ESL classroom. In L. Bouton & Y. Kachru (Eds.), Pragmatics
and Language Learning, 1, 43-52.

Brown, Penelope, & Levinson, Stephen. (1987). Politeness. Reprinted. Cambridge:
Cambridge University Press.

Coulmas, Florian. (1979). Routine formulae. Journal of Pragmatics, 3, 219-38.

Devine, Joanne. (1982). A question of universality: Conversational principles and
implication. In M.A. Clarke & J. Handscombe (Eds.), On TESOL '82, 191-206.
Washington, D.C.: TESOL.

Helgesen, Marc, Mandeville, Thomas & Jordan, Robin. (1986). English firsthand. San
Francisco, CA: Lateral Communications.

Leech, Geoffrey. (1977). LAUT Series, 46. University of Trier.

Leech, Geoffrey. (1983). Principles of Pragmatics. London: Longman.

Sharpe, Pamela. (1984). Talking with Americans. Boston: Little and Brown.

Takahashi, Tomoko, & Beebe, Leslie. (1987). The development of pragmatic competence by
Japanese learners of English. JALT Journal, 8, 131-55.

Tansey, Catherine, & Blatchford, Charles. (1987). Understanding conversations. Belmont,
CA: Wadsworth.

Thomas, Jenny. (1983). Cross-cultural pragmatic failure. Applied Linguistics.

Tillitt, Bruce, & Bruder, Mary. (1985). Speaking naturally. Cambridge: Cambridge
University Press.

Wardhaugh, Ronald. (1985). How conversation works. Oxford: Basil Blackwell.

Wolfson, Nessa. (1986). Research methodology and the question of validity. TESOL
Quarterly, 20, 689-99.







APPENDIX 
The Questionnaire 






1. N.H. says to you: "I have always wanted to travel abroad.” By saying this, N.H. 

(A) indicates that N.H. is jealous of you because you are in a foreign country 

(B) indicates that N.H. likes you because you are from a foreign country 

(C) tries to make conversation 

2. You see N.H. by the vending machines in the hallway. N.H. says to you: "How have 
you been lately?” By saying this, N.H. 

(A) indicates that N.H. is concerned about you 

(B) wants to find out how you like the teachers in your classes 

(C) wants to greet you and tries to make conversation 

3. N.H. says to you: "Is that a new coat?” By saying this, N.H. 

(A) shows that N.H. wants to try on your new coat 

(B) tries to make conversation 

(C) N.H. thinks you have a lot of money to buy a new coat

4. N.H. says to you: "Where are you from?” By saying this, N.H. 

(A) shows that N.H. does not like foreign students 

(B) shows interest in your country 

(C) tries to make conversation 

5. N.H. says to you: "What are you taking this quarter?” By saying this, N.H. 

(A) wants to find out what grades you have been getting 

(B) tries to make conversation 

(C) shows that N.H. wants your help with a homework assignment 

7. N.H. says to you: "Are you going home for the break?" By saying this, N.H. 

(A) indicates that all students should go home for the break 

(B) indicates that you should invite N.H. to visit you 

(C) tries to make conversation 

9. N.H. says to you: "You should go to New York. New York is a lot of fun.” By 
saying this, N.H. 

(A) indicates that N.H. thinks that New York is the most important city in the 
country 

(B) tries to make conversation 

(C) indicates that you have not travelled enough 

10. You are talking to N.H. in the street. N.H. says to you: "It’s really cold today." By 
saying this, N.H. 

(A) indicates that N.H.’s clothing is not appropriate for the weather 

(B) indicates that students from warmer climates are probably uncomfortable in Ohio 

(C) tries to make conversation

11. N.H. says to you: "How long have you been in the U.S.?" By saying this, N.H.

(A) tries to make conversation 

(B) shows that N.H. thinks your English is good 

(C) shows that N.H. thinks your English is not good 

12. N.H. says to you: "How do you like Columbus?" By saying this, N.H. 

(A) wants to know whether you miss your family 

(B) tries to make conversation 

(C) shows that N.H. likes Columbus 

13. N.H. says to you: "Do you like it in America?" By saying this, N.H. 

(A) indicates that if you are in America, you probably like it better here than in your 
country 

(B) indicates that you should tell N.H. about your country 

(C) tries to make conversation 

14. N.H. says to you: "Did you visit your family during the vacation?" By saying this, 
N.H. 

(A) indicates that you have been away from your family for a long time 

(B) tries to make conversation 

(C) indicates that if you feel lonely, you should visit N.H. 

15. N.H. says to you: "Two years ago, my friend had a roommate who was a foreign 
student." By saying this, N.H. 

(A) tries to make conversation 

(B) indicates that N.H. likes foreign students 

(C) indicates that N.H. dislikes foreign students 

16. N.H. says to you: "Have you been to California?" By saying this, N.H. 

(A) indicates that N.H. thinks that California is the best state in the country 

(B) tries to make conversation 

(C) indicates that N.H. dislikes California 

18. N.H. says to you: "Do you like American food?" By saying this, N.H. 

(A) shows pride in American restaurants 

(B) indicates that N.H. likes American food 

(C) tries to make conversation 

20. N.H. says to you: "It is probably difficult to study in a foreign country." By saying 
this, N.H. 

(A) indicates that N.H. thinks you have a heavy accent 

(B) indicates that N.H. thinks your English is excellent 

(C) tries to make conversation 

21. N.H. says to you: "Do you have American friends?" By saying this, N.H. 

(A) indicates that having foreign students as friends is probably exciting 

(B) indicates that foreign students should not try to make friends with Americans 

(C) tries to make conversation

22. N.H. says to you: "OSU is a very large university." By saying this, N.H.

(A) indicates that N.H. is a good student 

(B) tries to make conversation 

(C) indicates that N.H. is not a good student



THEMATIC OPTIONS IN REPORTING PREVIOUS RESEARCH



Sarah Thomas and Thomas Hawes 



This paper discusses one identifying feature of journal articles: reports
or references to previous research. The study focuses on an examination of
the formal and syntactic choices available to writers for making reporting 
statements and the conditions governing such choices. One factor contributing 
to the variety of reporting devices available in academic articles seems to be 
the writer’s choice of a particular element of the message as the theme of the 
reporting statement. The purpose of this article is to examine thematic 
options and their distribution to discover patterns of choice characteristic of 
reporting and of the research articles in which they occur. 

We investigate the way in which the choice of theme affects the 
syntactic form of the reports and suggest that it is possible to draw up a 
typology of reports based on participant subject in the theme. Reports were 
categorized as having agent themes, text reference themes, or content term 
themes. With these three main choices for theme, writers create variations 
when textual, interpersonal, or ideational elements in the form of 
circumstantial adjuncts work in conjunction with the subject headword. The 
different syntactic forms of reports resulting from different thematic choices 
are hypothesized to be associated with the function of reports in their 
contexts. 



INTRODUCTION 

The academic research article is generally recognized as the prime means for the 
communication of the most current research findings and the presentation of knowledge 
claims. Swales (1990) describes it as a written text reporting on some investigation carried 
out by its authors, and which usually relates the findings within it to those of others. The 
phenomenal growth of the research article as a genre means that it can be identified on two
levels: first, in broad terms by a recognizable communicative purpose and by the presence
of characteristic features with standardized form, function, and layout/presentation which are
regarded as part of its general conventions. Second, some characteristics of the features
of research articles are specific to different fields or disciplines and enable the recognition
of sub-genres, such as literary research articles or experimental research articles. A
disproportionately large number of the linguistic descriptions resulting from genre analysis
have focused on the academic research article. This concern of genre analysis with the
research article has often been motivated by pedagogical purposes, namely, the need for
teaching materials in ESP/EAP courses. Since ESP deals with texts and events directly
related to learners' fields of study or occupation, the emphasis has been on examining the
actual language of these situations. The particular disciplines students are in make demands
on them to read and write in particular genres. The rationale is that pedagogy which aims
at enabling learners to master these genres must be based on or at least informed by
linguistic analyses characterizing these genres.

Genre descriptions of research articles vary in terms of their focus which may be on 
the whole article or on selected sections such as the introduction or the discussion sections. 
Another dimension of variation is the decision to select a particular feature for study, for 
example, tense or nominals, or, alternatively, to focus on the overall rhetorical structure of 
the article. One identifying feature of the academic articles which has been of interest to 
researchers is reporting or references to previous research. 

Swales (1981, 1990) has investigated reporting and has identified it as the second of 
four structurally sequenced moves in the introduction sections of articles. Swales’ original 
model for article introductions has now been modified as other investigators have pointed 
out the limitations to its generalizability. A number of researchers have found that Move 
2, References to Previous Research, cannot always be identified as a single structural unit 
separate from Move 1 (Crookes, 1986; Bley-Vroman & Selinker, 1984), and that Move 2 
may be entirely absent or found anywhere in the introduction (Jacoby, 1987). Dudley-Evans 
(1986), in his analysis and application of Swales’ model to postgraduate dissertations, has 
identified References to Previous Research in discussion sections. 

Another kind of study of reporting in research articles is citation analysis in which 
counts of references to other researchers or reporting statements have been employed as a 
way of assessing the contribution of researchers to the field. This is not of interest to those 
investigating the way the feature operates and its use in research articles. Other studies of 
citations have focused on the use of reporting verbs to communicate evaluation (Thompson 
and Ye, 1991), tense forms and sentence function (Oster, 1981; Lackstrom, Selinker & 
Trimble, 1972), and choice of voice and its relation to sentence function (Tarone et al., 
1981). 



This study is an attempt to add to currently available research findings on reports by 
investigating some of the factors which determine the varying syntactic forms of reporting 
statements. 



THE STUDY AND ITS BACKGROUND 

The present research stemmed from a need to investigate some of the characteristic 
features of academic articles which might serve as a point of departure for the development 
of teaching materials. A preliminary look at some representative articles indicated that 
reporting was a feature that recurred throughout the article and not just predominantly in the
introduction sections. 

Furthermore, reporting is a great deal more frequent and important than learners’ 
EAP textbooks acknowledge. Teachers of EAP courses will testify to the observation that 
EAP learners commonly have serious difficulties with the range of choices involved in 
reporting, choices of syntactic form, tense, voice, reporting verbs, etc. Thus we consider 
reporting to be an important feature of journal articles that needs to be researched from all 
possible angles in order to provide input for pedagogic decisions. 

FOCUS OF THE INVESTIGATION AND HYPOTHESIS 

A previous study of reporting in academic articles that we carried out from the 
perspective of prediction (Tadros, 1985) suggested that some of the formal linguistic 
manifestations of the function of reporting in these texts had not been noted before. Thus, 
it seemed that a logical starting point for the study would be an examination of the formal 
and syntactic choices available to writers and the conditions governing such choices. 

It was obvious, very early in the study, that there was a great deal of variation in the 
forms of the reports in academic articles. One factor responsible for the variation seemed 
to be the writer’s choice of a particular element of the message as theme of the reporting 
statement. Our purpose in this research was to examine thematic options and their 
distribution to discover patterns of choice characteristic of reporting and of the research 
articles in which they occur. 

The hypothesis that the choice of thematic element governs the syntactic forms of 
reports seemed plausible as a starting point. Furthermore, we wished to consider if there 
was any association between the different syntactic forms resulting from different thematic 
choices and the function of reports in their contexts, or the co-occurrence of these forms 
with certain discourse features. 

The corpus for this study consists of 11 research articles on psychosomatic medicine 
taken from the British journal entitled Journal of Psychosomatic Research. They were 
selected at random, provided that they were not review articles, which we consider to have 
a somewhat different purpose. These articles provided the 129 reports for analysis as well 
as the contextual framework for the proposed investigation of form and function. 



THEORETICAL BASIS FOR ANALYSIS OF THEME 

The theoretical framework for this thematic analysis of reports draws on Halliday’s 
notions and categories of theme. We have selected those which seemed most likely to prove 
revealing about the data and then proceeded to adapt or modify them as required. 

Theme is a subsystem of the textual component, the third of the three major systems 
related to the ideational, interpersonal and textual functions in Halliday’s theory of language 
(Halliday, 1985). Theme concerns the structural relations in the syntactic units of 
clause/sentence. Distinguishing theme from the notion of given, Halliday describes it as the 
point of departure of the message (ibid., p. 38), and identifies it with the initial position in 
the clause. The remainder of the clause is the rheme. A message, then, consists of the 
structure THEME + RHEME. Halliday’s notion of a multiple theme also proved a useful 
analytic tool for study of the role of theme in reports. The theme of a clause can be a 
simple theme, functioning as the subject, complement, or adjunct. On the other hand, it can 
be a multiple theme with an internal structure made up of the three metafunctions of 
language: the ideational, the interpersonal, and the textual. The ideational element is 
obligatory in theme structure, while the textual and interpersonal elements are optional. The 
ideational contribution to theme structure is termed topical theme and is the element 
functioning as subject, complement, or circumstantial adjunct. The textual elements 
contribute meaning by relating the message to the preceding/following text and the context. 
The interpersonal component of theme is generally made up of modal expressions which 
express the writer’s judgment regarding the message. 

For Halliday, theme ends after the first ideational element which may not necessarily 
be the grammatical subject of the sentence, but it includes any preceding textual or 
interpersonal elements. The example below from a text in Hoey (1986) illustrates this. 

Now, after a ten-year programme, the Scottish plant breeding station thinks 
it has developed its own high enzyme barley.... [underlining added] 

In this example, the theme (underlined) does not include the grammatical subject (the 
Scottish breeding station) since the first ideational element ends with ten-year programme. 



BASIC ASSUMPTIONS 

In what follows, we state our position on theme, since our interpretation of what is 
thematic in a sentence differs somewhat from Halliday’s. 

1. We consider that it is more revealing to extend theme to include all elements up to 
and including the grammatical subject of the sentence. Our investigation is concerned with 
the aboutness of the reports which, in unmarked sentences, is conveyed through the 
grammatical subject. We draw support for the decision to include the grammatical subject 
as part of theme from Chafe’s (1976) comments on the function of theme, and also from 
Davies (1989). By including the subject with all the elements preceding it, the analysis will 
also give clearer insights into the ways in which thematic elements function as the point of 
departure and provide the orientation for the message in the rheme. 

2. This analysis is restricted to themes at sentence level, i.e. of independent clauses 
which are initial in the sentence. Themes of bound clauses or groups have been ignored 
because we consider that independent clauses make a greater contribution to the discourse 
flow than other clauses. If a bound clause is initial in a hypotactic sentence, the whole of 
the bound clause is treated as thematic. 

3. Sentences with existential subjects (it and there) were considered to be of special 
significance. However, these items and the following be verbs are empty of ideational 
content. They are only weakly thematic because the grammar requires them to be fixed in 
sentence initial position, and it is not a matter of choice. Therefore, the elements following 
these existential subjects, i.e. the first ideational elements, were taken as theme, for example 
evidence, in the following example. 

There is also evidence that maternal response to stimulation can affect the 
foetal heart rate. [underlining added] 

ANALYTICAL PROCEDURES 

1. Themes were identified by picking out the grammatical subject of the sentences as 
the obligatory element and then including all items preceding it: the textual, interpersonal, 
and ideational elements. The relative distributions of these options were calculated. 

2. An important objective of the study was to investigate the functions of the typical 
thematic structures and combinations of different elements. 

3. The term adjunct is interpreted broadly. We treat the combination of a preposition 
and non-finite clause as a special class of prepositional phrase rather than as a class of 
subordinate clause. 
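
The identification rule in procedure 1 can be illustrated with a minimal sketch. It assumes that the grammatical subject has already been marked by hand; the class name `AnnotatedSentence`, the field `subject_end`, and the function `split_theme_rheme` are hypothetical names introduced only for this illustration and are not part of the study's procedure.

```python
from dataclasses import dataclass


@dataclass
class AnnotatedSentence:
    """A tokenized sentence with a hand-annotated subject boundary (hypothetical)."""
    tokens: list[str]
    subject_end: int  # index one past the last token of the grammatical subject


def split_theme_rheme(sentence: AnnotatedSentence) -> tuple[list[str], list[str]]:
    """Extended theme: everything up to and including the grammatical subject."""
    theme = sentence.tokens[: sentence.subject_end]
    rheme = sentence.tokens[sentence.subject_end :]
    return theme, rheme


# The Hoey (1986) example, tokenized on whitespace for simplicity.
example = AnnotatedSentence(
    tokens=(
        "Now , after a ten-year programme , the Scottish plant breeding "
        "station thinks it has developed its own high enzyme barley"
    ).split(),
    subject_end=12,  # the subject ends with "station"
)

theme, rheme = split_theme_rheme(example)
print("THEME:", " ".join(theme))
print("RHEME:", " ".join(rheme))
```

Under this extended definition the theme runs through the subject (the Scottish plant breeding station), whereas a strictly Hallidayan analysis would end it at ten-year programme.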



THEMATIC CATEGORIES 

The inclusion of subject in theme allowed us to categorize themes of reports in terms 
of the kind of participant chosen for subject position. It is the choice of thematic elements 
that determines the form of the reports and therefore can provide a basis for categorizing the 
reports. 

Three main options open to the writer in the choice of theme can be identified on the 
basis of the type of participant subject which forms the headword in the thematic element. 
These are: 

A. Choice of agent as headword of the theme. This refers to the thematic choice of the 
human agent of the process being reported. 

B. Choice of a text reference term as headword of the thematic unit. This refers to a 
limited number of metalingual terms which are used to (i) describe the source from which 
the reported information has been abstracted (e.g. other studies), or (ii) label the rhetorical 
function of the stretch of language to which they refer (e.g. hypothesis, conclusion, finding). 

C. Choice of a content term. The label content term refers to the choice of an item of 
the scientific content for headword in the nominal in theme position. This is closely 
associated with a class of reports that we term averred reports. 
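
As a rough sketch, the three-way choice above could be expressed as a lookup on the headword of the theme. The word lists below are illustrative assumptions assembled from the examples mentioned in this paper; they are not the study's actual coding scheme, and the function name `classify_theme` and the flag `is_named_researcher` are hypothetical.

```python
# Illustrative word lists only; the study's full inventories are not given here.
TEXT_REFERENCE_TERMS = {
    "study", "studies", "research", "hypothesis",
    "conclusion", "finding", "findings", "results",
}
GENERALIZED_AGENTS = {"researchers", "investigators", "authors"}


def classify_theme(headword: str, is_named_researcher: bool = False) -> str:
    """Return 'agent', 'text reference', or 'content' for a theme headword."""
    word = headword.lower()
    if is_named_researcher or word in GENERALIZED_AGENTS:
        return "agent"  # option A: human agent of the reported process
    if word in TEXT_REFERENCE_TERMS:
        return "text reference"  # option B: metalingual term
    return "content"  # option C: item of the scientific content


print(classify_theme("Anderson", is_named_researcher=True))
print(classify_theme("studies"))
print(classify_theme("environments"))
```

A lookup of this kind only approximates the categories, since deciding whether a headword names a researcher or labels a rhetorical function ultimately depends on the analyst's reading of the context.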

Within these three main choices for theme, variations are created when textual, 
interpersonal, or ideational elements in the form of circumstantial adjuncts work in 
combination with the subject headword. First position in theme structure can be taken up 
by an adjunct which specifies the circumstances under which the participant was involved 
in a process. These adjuncts provide a context by giving the reader a temporal, locational, 
or situational framework for what follows in the rest of the sentence. Conceivably, the 
frameworks created by such adjuncts might extend over a stretch of text beyond the limits 
of the sentences in which they occur. For the sake of brevity, only the main categories for 
thematic choice are described below, although analysis at a more delicate level was carried 
out. 

AGENT AS THEME 



Patterns of Distribution 

Of the 129 reports, 51 or 38.5% had an agent headword. Agent themes can take one 
of two forms. One possibility is that the name of the researcher is specified as agent of the 
process. If more than two researchers are involved, generally one name followed by et al. 
is used, e.g. Morgan et al. Another possibility is that a generalized term is used for several 
researchers. The choice of a generalized term is determined by the fact that reference is 
being made to the authors of a number of different publications. Because of this, the et al. 
form cannot be used, and it is also not possible to name all the authors involved in the 
running text, though, of course, the full reference is specified in the bibliography. Such 
terms are generally preceded by quantifiers such as many, several. 

By themselves or in combination with circumstantial and textual elements, the 
distribution of agent themes can be represented as in Diagram 1. 




Diagram 1. Distribution of agent themes. 



Discourse Functions of Reports with Agent Themes 

The most frequent choice for theme in reports, i.e. the named agent, seems to co- 
occur with a limited number of discourse features which are described below. 

Limited framework. In about 28% of reports with named researcher, the named 
researcher is preceded by an adjunct as the first element of theme. This adjunct serves to 
restrict the circumstances for interpreting the argument in the rheme or the message of the 
report. The following example illustrates how this works. 

In exposing four psychosomatic groups (rheumatoid arthritics, tension 
headache sufferers, migraineurs and hypertensives) to exercise, Anderson found 
that the more unpleasant a subject reported the stress to be, the lower that 
subject's physiological level of activation also tended to be. [underlining 
added] 

The underlined adjunct/participial bound clause in the report above is fronted as it 
has the function of specifying the limits within which the reported information claims to 
apply. It describes the experimental procedure, which is not so much what the writer is 
interested in communicating, but what is necessary for the reader to be able to interpret the 
finding which is the point of the message. It provides the specific circumstances for which 
the findings have applicability or validity. Specifying this kind of limiting circumstance 
results in a narrowing down of the information being presented. At the same time, the 
adjunct/participial bound clause has an implicit agent, i.e. the one who exposed the four 
groups. The reader’s inference of an implicit agent creates an anticipation that there will 
immediately be more specific details telling the reader who the agent is and what the limiting 
circumstances apply to. The data shows that when text reference terms are in theme 
position, such circumstantial adjuncts do not co-occur. 

Generalization in the immediately preceding text. There is a definite tendency (31% 
of reports) for a generalization to be present in another sentence immediately prior to a 
report with a named researcher in theme position, when there is no circumstantial adjunct 
and sometimes when a circumstantial adjunct is also present. This does not refer to a 
generalization in the circumstantial adjunct itself when such an adjunct is present. The 
report with named researcher provides details or serves as an example of the generalization. 
The report below illustrates how this works in the data. 

Its etiology is unknown, although several theories have been put forth to 
explain it. Raynaud hypothesized an overactivity of the sympathetic nervous 
system leading to an increased vasoconstrictor response to cold [2] while 
Lewis postulated a local fault in which precapillary resistance vessels were 
hypersensitive to local cooling [3]. 



The bound clause in the first sentence is a general preview statement, and it requires 
realization of its particulars in the form of examples. Such examples/particulars usually have 
named researchers as first thematic element. When specifics are predicted by a general 
projecting statement, they cannot take the form of reports with text reference terms as theme 
as these are not sufficiently specific to meet the requirements of the general statement. 

Basis for a claim in preceding context. Approximately 30% of reports with named 
researcher as theme were functioning as the basis for a claim in the preceding context. In 
these cases, the first member of the relation tends to be a highly tentative claim made by the 
writer to account for a particular observation or result obtained earlier. What follows is the 
basis for the speech act of making the claim. This can be glossed as "I can say this because 
researcher X has shown Y." The predicted basis thus has to be a reference to a specific 
research experiment or study either by the writer in a previous work or by another 
researcher. In order for the report to function as a basis for the speech act of making a 
claim, it is also necessary for agency to be specified in the report. Its strength as a basis 
lies in the fact that it is in an independent source, and this has to be emphasized by making 
the agent the thematic element. If the agent/named researcher is omitted, it becomes more 
difficult for the reader to perceive the relation of the report as a basis for the claim. The 
example below illustrates the thematization of named reporter in a report in a Claim-Basis 
relation. 

1. It appears that the two scales may differ in terms of their sensitivity. 2. 
Buros (32) suggests that the reliability of the POMS reflects its relative lack 
of sensitivity to changes of state, as opposed to more stable traits, and thus 
may indicate that the test does not really measure fluctuating affective states. 

In sentence 1 above, the writer makes a tentative claim to account for results obtained 
using the two scales, POMS and the Nowlis. To justify making such a claim (to show that 
it is not unreasonable to suggest that the two scales may differ in their sensitivity), in sentence 
2 he describes the findings of another researcher which support his own claim. This same 
effect would not be achieved if the report were presented in averred form and without the 
reporting clause, so that it reads: 

The reliability of the POMS reflects its relative lack of sensitivity to changes 
of state, as opposed to more stable traits.... (32). 

When this version is juxtaposed with the claim in sentence 1, the Claim-Basis relation 
is no longer evident, and the discourse becomes incoherent. It is the thematization of the 
researcher in the reporting clause which allows the discourse relation to be perceived. 



TEXT REFERENCE TERM AS THEME 



Patterns of Distribution 

Of the 129 reports, 32 or 25% of the total had a text reference term as subject. A 
distinction is made between text reference terms which have an anaphoric function, such as 
test results, and text reference terms that are non-anaphoric, e.g. other studies. Adjuncts 
sometimes co-occur with text reference terms. The distribution of text term thematic 
elements is represented in Diagram 2. 




Diagram 2. Distribution of text term themes. 

Text reference terms are a significant option for theme in reports, although not as 
frequent in distribution as agent headwords. A high proportion of the text reference terms 
are anaphoric and relate to the preceding discourse. They generally have fronted deictics, 
e.g. such, these, but no adjuncts. It is hypothesized that the absence of such adjuncts has 
to do with the clause-relating function of anaphoric text reference themes. They serve to 
carry forward a report in order to elaborate on it or else to set up a relationship with another 
report. This aspect of the functions of theme will be dealt with in the next section on 
clause-relating functions of theme. It appears that adjuncts preceding the anaphoric text 
themes would interfere in some way with their clause-relating function. 

Discourse Functions of Reports with Text Reference Term Themes 

We have suggested that text reference terms which are anaphoric have a clause- 
relating role that will be discussed below. When non-anaphoric text terms are in the plural 
form, the reports in which they occur are generalizations or summarizing statements. The 
text terms chosen are either studies or research. Although it would have been possible to 
have the plural form of the agent (e.g. researchers, investigators) in subject position, in 
place of the text term in the generalizing/summarizing reports, interestingly this does not 
occur. Such plural forms are generally found only in the more specific supporting reports. 



CONTENT TERM AS THEME 



Patterns of Distribution 

Of the 129 reports, 33 or 25% had a content term in theme position. Approximately 
half of them occurred as the sole element of the theme, while the rest were mainly preceded 
by a textual adjunct. The types of content themes are represented in Diagram 3. 




Diagram 3. Distribution of content term themes. 

This clearly shows that interpersonal and circumstantial adjuncts were quite 
insignificant as choice, with only one and two occurrences respectively. When some item 
of the scientific content is thematized, the resulting report falls into one of two types. Either 
it is an averred report in which there is no reference to the source of the reported 
information either by a reporting structure with a reporting verb or a reporting adjunct, i.e., 
what we have is a report by footnote. Or it is a passivized report in which reporting 
structure is evident but there is no reference to the agent/reporter who is the source of the 
reported information. The content is split so that after the theme elements, the passive form 
of the reporting verb is interposed. This is then followed by the rest of the content. An 
example is given next. The content terms have been put in wing brackets. 

{The vasospastic attacks of Raynaud’s disease} are thought to be precipitated 
by cold, emotional stress, or a combination of the two [4]. 

Discourse Functions of Reports with Content Term Themes 

Content themes seem to be associated with the content in the rhemes of the preceding 
sentence(s). This suggests that the choice of a content theme as opposed to an agent theme 
or a text reference theme might be motivated by the need to maintain thematic continuity in 
one of its manifestations. The example which follows helps to illustrate the point being 
made. 



1. More importantly, living and working environments which involve stressful 
confrontations and demands upon the individual are often [1-4], but not 
always [5], associated with increased incidence of hypertensive disease in 
humans. (Rep B4) 2. The environments that are associated with an 
increased risk of hypertensive disease have been characterized as posing 
constant threat and uncertainty [1]. 

Report B4 above has a content theme which is derived from the preceding report and 
summarizes and carries forward the information that certain kinds of environment are 
associated with an increased risk of hypertensive disease. All this information is contained 
in the theme so that it serves as the take-off point for the new message in the rheme, i.e., 
that these environments have been characterized as posing threat and uncertainty. 

The point here is that the choice of the content theme in Report B4 was necessary and 
that the other options of an agent theme or a reference theme were then not available. Since 
the writer wanted to say something further about the reported information in Reports B2 and 
B3, it was necessary for him to retain items of the content of the previous sentence in a 
linear thematic progression. We can see how the coherence of the text as a whole is 
disturbed if we select a different kind of theme, e.g. agent as theme, as illustrated next in 
a modified version of the previous quotation. 

More importantly, living and working environments which involve stressful 
confrontations and demands upon the individual are often [1-4], but not 
always [5], associated with increased incidence of hypertensive disease in 
humans. Researchers have [research has] characterized the environments that 
are associated with an increased risk of hypertensive disease as posing 
constant threat and uncertainty. 

In this changed version, we are being told something about researchers or about 
research and not about the environments associated with an increased risk, as in the original 
version. There is a need, then, to retain as theme what has been given in the previous 
sentence and it is this that helps to determine the choice of a content theme. 

The thematic choices we have described so far for reports in general are represented 
in Diagram 4 below. Although this diagram describes the actual choices made, we suggest 
that it is closely relatable to a systematic network representing the potential choices available 
to the writer. Note that in this diagram, as well as in those above, percentage figures treat 
the category to the left as 100% base. For example, Named Agent is 90% of Agent themes 
and not of reports as a whole. 
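
Because each percentage in the diagrams is relative to its parent category, converting a figure to a share of all reports requires multiplying through by the parent's own share. The short calculation below is offered only as an illustration of that arithmetic, using the counts reported above (51 agent themes out of 129 reports) and the 90% figure for Named Agent; the function name `share_of_all_reports` is a hypothetical helper.

```python
def share_of_all_reports(share_of_parent: float, parent_count: int, total_reports: int) -> float:
    """Convert a share of a parent category into a share of all reports."""
    return share_of_parent * parent_count / total_reports


# Named Agent is 90% of Agent themes; Agent themes are 51 of the 129 reports.
named_agent_share = share_of_all_reports(0.90, 51, 129)
print(f"Named Agent as a share of all reports: {named_agent_share:.1%}")
```

On these figures, named agents account for roughly 36% of all reports, which is why the 90% in the diagram should not be read as a proportion of the whole corpus.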





Diagram 4. Summary of thematic choices available to the writer. 

BOUND CLAUSE AS THEME 



Patterns of Distribution 

In a small number of the reporting sentences (7 out of 129), the choice for theme was 
a bound clause. Sometimes the thematized bound clause was a writer comment, while the 
rest of the sentence, the rheme, was the report. Another pattern was for the report itself to 
take up the whole theme. It was connected to a writer comment in the rheme. A third 
possibility was for one report to occur as the bound clause and another report to be the free 
clause in the same sentence. These choices for bound clause as a theme can be represented 
as in Diagram 5. 




Diagram 5. Choices available for bound clause themes. 



Discourse Role of Bound Theme 

Information in fronted bound clauses in reports is often new information just like the 
information in the free clause. Rather than present them as two independent clauses linked 
by a sentence connector, the writer can present one as being of subordinate status relative 
to the other. The choice of bound clause as theme is, thus, seen to be determined by the 
need to specify two pieces of information and, at the same time, to mark the relation of one 
to the other as being of subordinate status. Once the choice for bound clause has been 
made, it then serves as the point of departure for the free clause which follows. 



PREDICATED THEME 



Patterns of Distribution 

Approximately 10% of the reports had a predicated thematic structure, i.e. with the 
form it + be or there + be. The network of options available with predicated themes is 
represented in Diagram 6. 

Diagram 6. Choices available with predicated themes. 



Discourse Role of Predicated Theme Reports 

This form is used as a device to avoid having old/given information at the beginning 
of the report. If the writer wishes to foreground the new information, s/he can do this by 
having the empty subject there in thematic position. In the data studied, this device is 
adopted in order to foreground particular kinds of information. 

To assert the existence of evidence. The predicated there subject seems to be used 
when the writer is arguing for the existence of evidence of a certain kind in connection with 
some point s/he is making. These themes co-occur with nominals like evidence and support 
as in There is increasing support for the hypothesis. 

In these structures there takes up the given position in the sentence. Since it has no 
informational value, the attention falls on the complement part. In this way the message in 
the rheme is foregrounded. The following example will help to make this clearer. 

1. Chronic bronchitic patients do seem to suffer from high levels of 
psychiatric disorder and a close liaison between chest physicians and 
psychiatrists or clinical psychologists may contribute to the effectiveness of 
patient management. 

2. There is some evidence that alleviation of psychiatric symptoms is 
associated with a reduction in breathlessness [14]. 

Sentence 1 above is a tentative claim by the writer. He then brings in evidence in 
the form of a report as the basis for making the claim. The predicated structure allows the 
foregrounding of the word evidence and the clause qualifying it, so that the relationship 
between the two sentences as claim and supporting evidence is highlighted. The semantics 
of the lexical items evidence and support carry an additional element of meaning which is 
not conveyed by other text reference terms, such as findings, research, and studies. 

To emphasize modality, i.e. possibility. The following report exemplifies the way in 
which an it-predicated theme has the role of emphasizing modality. 

It may be that patients show a form of expectancy learning, in which 
"physiological responses are brought under the influence of environmental 
stimuli when events occur closely in time in a highly predictable relationship" 
[26]. 



In the above, the writer uses the report as part of his explanation for some results that 
seem to be anomalous. He is offering the explanation very tentatively and this effect is 
achieved by predicating the modality as It may be. The effect of tentativeness is further 
increased by inclusion of part of the reported information as direct speech within the indirect 
speech. 



The data has indicated a general absence of lexical modal elements such as possibly 
in thematic position. When modality has to be shown in the reported proposition it is 
generally in non-thematic position, as in: Thus, expressiveness tends to moderate 
physiological responsivity.... 

To omit agent as subject in a report. The thematic element in some reports is the 
form it + has been shown/suggested. This structure enables the writer to omit mention of 
the agent as subject in theme position and yet maintain the report structure if s/he does not 
want to present it as an averred report, i.e. one which the reporting writer fully endorses. 
An example is given below. 

1. This test showed that the exercise group scored significantly better on all 
three parts of the test following exercise. 

2. No change was obtained in the control group. 

3. It has been suggested that naming a word is an automatic process 
whereas the name of a hue is the result of a conscious effort to choose and 
say the name of a colour. 

In this example, the message in sentences 1 and 2 states the results of the test. In 
the report in sentence 3, the writer wishes to introduce a research finding that has relevance 
to the results obtained in the test, i.e. that might help to explain them. As such, he does not 
want to make the researcher the point of departure. The sentence is not so much about what 
the researcher has done as about what throws light on the results described in sentence 1. 



CONCLUSION 

This paper has discussed the way in which the choice of theme affects the syntactic 
form of the reports. The notion of theme used here is an extended one which includes the 
grammatical subject of the clause as it is more likely to give information on the "aboutness" 
of reports. 

It was suggested that a typology of reports based on participant subject in the theme 
element can be drawn up. Reports were categorized as having (a) agent themes, (b) text 
reference themes, and (c) content term themes. With these three main choices for theme, 
variations are created when textual, interpersonal, or ideational elements in the form of 
circumstantial adjuncts work in conjunction with the subject headword. 

Additional thematic choices are bound clause and predicated it and there. The 
thematic choices for reports in terms of their correlation with discourse features were 
discussed. Where there is a limiting circumstantial adjunct as initial element, the choice for 
the headword of the theme is an agent rather than a text reference term or a content term. 
Also it was noted that an agent theme is often preceded by a generalization in the prior 
context. Reports with plural non-anaphoric text term themes are generalizations or summarizing 
statements. It was also noted that the choice of content term themes might be motivated by 
the need to maintain thematic continuity. Predicated themes were seen to have the discourse 
functions of asserting the existence of evidence, emphasizing modality, or omitting the agent 
as subject. 

After a review of literature on connected speech modifications in English, this 
paper reports on a study of the connected speech modifications that five 
intermediate-proficiency (IP) and five high-proficiency (HP) Japanese ESL learners 
used in sentence reading and spontaneous speech tasks, and compares their 
performance to that of five native speakers (NS) of American English. Speech 
samples from the fifteen subjects were transcribed phonetically and analyzed for 
linking, flapping, vowel reduction, and consonant cluster simplification. The 
results indicate that while the HP group approximated the performance of the NS 
group in several categories, the IP group often lagged far behind. Furthermore, the 
IP group used forms predicted by native language transfer theory more often than 
did the HP group. All three groups produced significantly more modifications 
during the spontaneous task than during the sentence reading task. The report 
concludes that language proficiency, native language, and style shifting are 
important factors affecting Japanese ESL learners’ connected speech. 



INTRODUCTION 



Connected Speech Simplifications of Native Speakers 

In the pronunciation of English it is important to distinguish the citation form and the 
connected speech form of a lexical item. Citation forms occur in isolated words and in 
words under heavy stress in sentences delivered in a slow, careful style (Lass, 1984). Such 
forms closely match the pronunciation predictions derived in generative phonology by 
applying phonological rules to underlying phonological representations (Chomsky & Halle, 
1968). By contrast, connected speech forms often show a variety of simplifications (Brown, 
1977; Lass, 1984; Dalby, 1986; Hieke, 1987; Temperley, 1987) which are not so easily
predicted by the phonological component of the language. 

Lass (1984) observes that the disparity between citation forms and connected speech 
forms is sometimes so great that it appears that the speaker possesses two different 
languages. He attributes connected speech simplifications not only to surrounding sounds 
but also to speaking rate, the formality of the speech situation, and other social factors 
related to speech variation: 

What appears to happen is that the faster and more casual speech becomes, the less 
it is ‘focal’ to the speaker’s concern, the less attention he pays to it. Therefore, the 
inertial properties of the speech apparatus tend to take over, as if it were a
‘gravitational’ effect where decrease of attention leads to decrease of effort. To put
it crudely, things tend to get done the easiest way, movements flow along a path 
of least effort. As attention decreases, so does control; and both distinctiveness and 
distinction decrease (Lass, 1984, p. 297-298). 

The major characteristics of connected speech that Lass identifies are (1) more frequent 
assimilation, in which the distinctiveness among adjacent sounds is lost; (2) a blurring of 
boundaries and a reorganization of phonetic material; (3) lenition, or a lesser degree of 
closure in the vocal tract; (4) vowel reduction, by which is meant vowel centralization and
shorter vowel duration, as well as possible vowel loss leading to the syllabification of
consonants; (5) a shorter duration of long sound segments; and (6) the deletion of consonants
in consonant clusters. 

Sometimes the restructuring can be quite radical, as when phonotactic constraints are 
violated. For example, although the sequences [ts] and [ks] are not allowed word-initially 
in English, Lass (1984, p. 299) reports the following in his own casual speech: 

(1) [tsilmz] for it seems 

(2) [kstriimlii] for extremely 

Radical restructuring can also modify the form used to express a grammatical meaning. Lass 
(1984:301), for example, has observed that in very casual speech a speaker may express the 
definiteness of a noun by lengthening its first segment rather than by using the definite 
article. 

(3) Put [milk] on the table. [= milk] 

Put [miilk] on the table. [= the milk] 

The social distance of interlocutors also affects the frequency with which such 
modifications occur. The more similar the speaker and listener are in terms of social group 
and background knowledge of the topic under discussion, the greater the listener’s ability to 
reconstruct the meaning of the message, and therefore the less attention the speaker is 
required to pay to clear articulation. 

Hieke (1987) discusses many of the same connected speech phenomena as Lass (1984), 
although he uses the term absorption for connected speech modifications and sets up different 
categories for classifying them. Of particular interest in the present study is his discussion 
of linking, or resyllabification, which he views as evidence of the speaker’s tendency to
avoid hiatus in casual speech. In the three types of linking he describes (consonant
attraction, glide attraction, and release attraction), a segment in word-final position is
reassigned to the first syllable of the following word. Consonant attraction is illustrated in 
the following phrase, where a period [.] represents a syllable boundary. 

(4) [graeb.dit] for grabbed it 

Ambisyllabicity, another syllabification process that Hieke (1987) and especially 
Trammel (1992) discuss, occurs when a consonant cannot be assigned exclusively to one 
syllable or another but is shared by both. In the following example parentheses identify the
ambisyllabic segments. 

(5) [?ai.ge(s)i(r)iz] for I guess it is. 

Although some controversy surrounds ambisyllabicity (Trammel, 1992), the present study 
accepts the notion and treats all ambisyllabic segments at word boundaries as examples of 
linking. 

Brown (1977) also discusses connected speech and presents a detailed description of 
modifications resulting from assimilation and deletion. She also includes an account of 
vowel simplifications characteristic of the RP (received pronunciation) variety of British 
English and the reduction of visual clues in connected speech. 

Dalby (1986) investigated deletion of unstressed syllables in the connected speech of 
native speakers of American English. He transcribed and analyzed television news 
interviews and data from three native speakers who produced slow and fast versions of a set 
of test sentences. His results show that the rate of weak syllable deletion increases as 
speaking rate increases. He also found that the position of an unstressed syllable in a word 
and the segments surrounding it influence the rate of deletion. 

Connected Speech Simplifications in ESL Pronunciation 

Teaching Materials. ESL literature on pronunciation instruction has addressed the issue 
of connected speech modifications but is not unanimous on how much to teach to ESL 
learners. On one side, Brown (1977) argues for teaching connected speech modifications
fairly early for comprehension purposes but not for production purposes. Instead, ESL
learners should be left to acquire them on their own when they reach a more advanced stage
of language development. On the other side, Temperley (1987) argues that linking and 
certain types of consonant deletion should be taught for production. Avery and Ehrlich 
(1992) state that it is also important to teach vowel reduction and certain types of 
assimilation.



Authors of recently published ESL pronunciation textbooks (Chan, 1987; W. Dickerson,
1989; Gilbert, 1992; Dauer, 1993; Grant, 1993; Lane, 1993) seem to take the latter
position, including in their texts production exercises on some of the same connected speech 
modifications discussed above. For example, Dauer (1993) has production exercises for 
linking, vowel reduction, syllabic consonants, contractions, and consonant cluster reduction. 

The inclusion of such simplifications for production practice is probably due to the 
greater emphasis in ESL pronunciation instruction on suprasegmentals in the last decade or 
so (Gilbert, 1984; Pennington & Richards, 1986; Wong, 1987; W. Dickerson, 1989;
Anderson-Hsieh, 1990, 1992). The assumption is that ESL learners who use some of the
same connected speech simplifications that native speakers use can more readily approximate
native-like patterns of timing, stress, and rhythm (W. Dickerson, 1989). However, it is
important to note that none of the pronunciation textbooks on the market include exercises 
on the more radical types of restructuring that Lass (1984) describes. 

Research. In spite of the importance of connected speech in recent ESL pronunciation
materials, little research has documented whether ESL learners actually acquire connected 
speech modifications. The only aspect of speech modifications to receive much attention has 
been the simplification of consonant clusters through two competing strategies: consonant 
deletion and vowel epenthesis. Consonant deletion is the strategy of native speakers of 
English in which they drop one member of a consonant cluster. Vowel epenthesis, a strategy 
sometimes used by children acquiring English as their first language (Oller, 1974) and by
ESL learners from certain language backgrounds, involves inserting a vowel before, after, 
or in the middle of a consonant cluster. Although vowel epenthesis at first glance would 
seem to run counter to simplification because it adds rather than deletes a sound segment, 
it is considered to be a simplification process because it breaks up a consonant cluster into 
two syllables which are easier to pronounce than the original sequence of consonants. 

Several empirical studies have investigated syllable simplification in learners from 
different language backgrounds (Tarone, 1980; Broselow, 1983, 1984; Anderson, 1983, 
1987; Sato, 1984; Riney, 1990; Eckman, 1991). The focus of these studies has been the 
relevance of syllable simplification for second language acquisition theories. For example, 
Anderson (1987) has shown that while native language transfer is a major factor in the 
frequency of simplification and the choice of simplification strategy, language universals 
(Greenberg, 1978) also play a role. Canonical syllables that are universal across languages 
are easier to learn, while canonical syllables that occur less often are more difficult. 

The only study which investigates a broader range of connected speech phenomena and 
compares the modifications of ESL learners with those of native speakers is by Hieke (1987). 
He investigates flapping as well as consonant cluster reduction in a group of intermediate 
ESL learners from unspecified language backgrounds and compares their simplifications with 
those of native speakers of American English using the same speaking task, namely, a 
paraphrase task in which speakers retell a story just heard. In his analysis of recorded 
speech samples, Hieke discovered that the native speaker group used flapping and cluster 
reduction significantly more often (approximately 30% more so) than the nonnative group. 

Although Hieke’s research represents an important contribution to second language 
phonology, his study leaves several questions unanswered. First, even though his ESL 
subjects represented several language backgrounds, he did not investigate the effect of the 
learners’ native phonologies on the rate of modification and the kinds of modifications that 
occurred. Secondly, he did not investigate whether his subjects used other strategies besides 
consonant deletion, such as vowel epenthesis, to simplify consonant clusters. Thirdly, since 
his study looked at only intermediate ESL learners, the extent to which advanced ESL 
learners approximate native-speaker modifications is not known. And finally, by restricting 
his data to casual, spontaneous speech, Hieke did not investigate the extent to which style 
shifting (Labov, 1972) affects the connected speech modifications of ESL learners, even 
though style shifting has been shown to operate in other aspects of the ESL learner’s speech 
(L. Dickerson, 1974; Schmidt, 1977; Beebe, 1980; Major, 1987). 

The purpose of the present study is to investigate connected speech modifications in one
group of ESL learners (native speakers of Japanese) at both the intermediate and advanced
levels of proficiency and to compare their modifications with those of native speakers of
American English on the same two speaking tasks: sentence reading and spontaneous
speech. Japanese ESL learners were chosen because the canonical syllable structure of
Japanese differs greatly from that of English, and Japanese does not reduce vowels in the 
same way that English does. These contrasts between Japanese and English suggest that 
Japanese ESL learners are likely to modify certain aspects of connected speech differently 
from the way that native speakers of English do. 

Relevant Contrasts between English and Japanese Phonology 

While English allows consonant clusters of three members in word-initial position and 
clusters of four members in word-final position, Japanese allows only consonant-glide 
clusters word-initially and no clusters in final position. In fact, only one consonant appears 
in word-final position, namely, /n/, which alternates in certain contexts with /ŋ/ (Vance,
1987). This radical difference in syllable structure suggests that Japanese ESL learners are 
likely to simplify English syllables even more often than native speakers of English.

However, whether Japanese ESL learners will simplify clusters through consonant 
deletion is another question. When foreign words with consonant clusters are incorporated
into the Japanese language, they are simplified in the Katakana syllabary by inserting vowels 
between any adjoining consonants so that their syllable structure conforms to the canonical 
syllable structure of Japanese. Thus when speaking English, Japanese learners may show a 
tendency to simplify clusters by inserting vowels. This is the view apparently taken by 
Thompson (1987), who claims that the major simplification strategy used by Japanese ESL 
learners is vowel epenthesis. He illustrates the difficulties Japanese ESL learners have with
connected English speech in the following sentence from a hypothetical Japanese learner (p. 
215). All of the consonant cluster simplifications involve vowel epenthesis rather than 
consonant deletion. 

(6) They’ll have to make a bigger effort in future.

[zeiru habu tsu: meiku a bii}a: eho:to ii: gt*:tja:] 
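
The loanword adaptation described above can be sketched as a toy rule: insert a vowel after every consonant that is not already followed by a vowel, so that the output conforms to a consonant-vowel syllable pattern. The sketch below is illustrative only. The romanized segment lists, the function name epenthesize, and the table of epenthetic vowels are assumptions made for this example and are not taken from Thompson (1987) or from the present study; actual learner speech may use other vowels.

```python
VOWELS = {"a", "i", "u", "e", "o"}

# Assumed, simplified choice of epenthetic vowel by preceding consonant:
# "o" after t/d, "i" after ch/j, and "u" elsewhere.
EPENTHETIC = {"t": "o", "d": "o", "ch": "i", "j": "i"}


def epenthesize(segments):
    """Insert a vowel after every consonant not already followed by a vowel.

    `segments` is a list of romanized segments (a hypothetical input format).
    A nasal "n" that is not followed by a vowel is left alone, since the
    text notes that /n/ may close a word in Japanese.
    """
    out = []
    for i, seg in enumerate(segments):
        out.append(seg)
        if seg in VOWELS:
            continue
        nxt = segments[i + 1] if i + 1 < len(segments) else None
        if nxt in VOWELS:
            continue  # consonant already followed by a vowel
        if seg == "n":
            continue  # syllable-final nasal is permitted
        out.append(EPENTHETIC.get(seg, "u"))
    return "".join(out)


# A word with an initial three-member cluster and a final consonant.
adapted = epenthesize(["s", "t", "r", "a", "i", "k"])
```

Every cluster in the input is broken up by insertion and no consonant is deleted, which is the pattern that sentence (6) is meant to illustrate.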

On the other hand, a study of bimorphemic clusters in the spontaneous speech of 
Japanese ESL learners (Saunders, 1987) shows that they use both consonant deletion and 
vowel epenthesis as strategies for simplifying the clusters. It is not known if this pattern of 
simplification holds for monomorphemic clusters as well. 

In addition to contrasts in syllable structure, Japanese and English also treat unstressed 
vowels differently. In Japanese, vowels that occur in voiceless contexts are often devoiced, 
and vowels are sometimes deleted (Vance, 1987). However, vowels are never centralized 
as they are in English. This would seem to predict that Japanese ESL learners will tend not 
to centralize vowels in unstressed syllables but will devoice and delete them instead. 



Research Questions 

The present study poses several questions concerning the connected speech of Japanese 
ESL learners: (1) To what extent do intermediate- and high-proficiency Japanese ESL
speakers differ from native speakers of American English in their use of connected speech
modifications? (2) To what extent do intermediate- and high-proficiency Japanese speakers
differ from each other in their use of connected speech modifications? (3) What effect does
native language transfer have on connected speech modifications? In particular, how does 
it affect the rate of vowel reduction and consonant cluster simplification and the choice of 
strategy for simplifying consonant clusters? (4) What is the effect of style shifting on 
connected speech modifications? That is, is there a difference in the rate of connected 
speech modifications in spontaneous speech compared to sentence reading? 

Connected Speech Modifications Investigated 

The connected speech modifications investigated in this study are alveolar flapping, 
linking, vowel reduction, and consonant cluster simplification. Each is described below. 

Alveolar flapping. In alveolar flapping, an alveolar flap replaces an alveolar stop. 
Although flaps and stops involve closure, the duration of closure is shorter for a flap. The 
flap is illustrated below in two different contexts-within a word, (7), and at a word 
boundary, (8). In all cases, the flap is ambisyllabic. 

(7) [lɛ(ɾ)ər] for letter

(8) [ʃʊ(ɾ)ai.pʰʊ(ɾ)ɪ(ɾ)an] for should I put it on

Linking. As Hieke (1987) has observed, linking can occur in English between two
consonants, between a consonant and a vowel, or between two vowels. In consonant-to- 
consonant (CC) linking, in which the consonants are identical, only one consonant is realized 
and may be slightly prolonged, as in (9) and (10). 

(9) [ðæʔ.tʰaim] for that time

(10) [lɛ(sː)miθ] for Les Smith

In other types of consonant-to-consonant linking, release attraction can occur, as illustrated 
below. In (11), the final stop of a word is released at the beginning of the following word. 
In (12), where the first consonant of the second word is a nasal, the release of the stop is 
nasal rather than oral. 

(11) [?ae?.t0E+.m0z] for at Thelma*s (oral release) 

(12) [?ae.d"mor] for add more (nasal release) 

Consonant-to-vowel (CV) linking involves the assignment of the final consonant of a word 
to the following, vowel-initial syllable: 

(13) [kain.dsv] for kind of 

Vowel-to-vowel (VV) linking occurs when a word-final tense vowel is followed by a word-
initial vowel. This kind of linking often involves glide attraction, as in (14).

(14) [se.jit] for say it 

Vowel reduction. Vowels that are full in stressed syllables are often reduced or 
centralized in unstressed syllables, as illustrated below. 

(15) [yə] for you

Sometimes the vowel in the unstressed syllable completely disappears, eliminating a syllable 
or leaving a syllabic consonant, as in (16). Since such syllabic consonants are the result of 
vowel reduction, they are included in the vowel reduction category. 

(16) [n̩] for and

Consonant Cluster Simplification. Two types of consonant cluster simplification to be
investigated are consonant deletion and vowel epenthesis. As noted, consonant deletion is 
the more native strategy for simplifying clusters, while vowel epenthesis is the strategy 
predicted to be used by the Japanese speakers. Consonant deletion occurs when two or more 
consonants appear together at a syllable or word boundary, as in (17) and (18). 

(17) [streiʒ.li] for strangely

(18) [laes.nait] for last night 

Vowel epenthesis can occur in the same environments as consonant deletion as the following 
examples illustrate. 

(19) [strein.dʒi.li] for strangely

(20) [laes.tu.nait] for last night 



METHOD 

Subjects 

The native speaker group (hereafter NS group) consisted of five native speakers of 
American English. Four of the speakers were native Californians and one was a native 
Iowan; three were male, and two were female. The Japanese speakers were selected from 
the population of Japanese students at International Christian University in Tokyo who take 
English proficiency tests as part of their requirements for admission. Their selection was 
based on an impressionistic judgment of their speaking proficiency. Five of the speakers 
were judged as having intermediate speaking proficiency and five were judged as having high 
proficiency. In the intermediate-proficiency group (hereafter IP group), three of the speakers 
were female and two were male; in the high-proficiency group (hereafter HP group) two of
the speakers were female and three were male. 



Materials and Procedures 

Use of an abbreviated version of the SPEAK test confirmed the original impressionistic 
grouping of speakers. Six evaluators with some training in SPEAK test administration rated 
the pronunciation, fluency, and comprehensibility of the nonnative samples on a four-point 
scale. Each speaker received one score, computed as the mean score of the six raters on the 
three SPEAK subtests. The mean score for the intermediate-proficiency (hereafter IP) group
was 1.4 on a scale from 0 to 3, the scores ranging from 1.22 to 1.62. The mean score for
the high-proficiency (hereafter HP) group was 2.5, the scores ranging from 2.12 to 2.80. 

To elicit connected speech simplifications, the test required subjects to perform a 
sentence reading task and a spontaneous speech task. The sentence reading task was used 
to control for vocabulary, grammar, sound segments and consonant clusters, thereby making 
possible a more reliable comparison among the speakers. The sentences contained a high 
concentration of word-boundary consonant clusters, providing many opportunities for linking
and consonant cluster simplification. An exercise on consonant clusters in an ESL 
pronunciation text (Prator & Robinett, 1985) was the source of the reading task sentences. 
A copy of the test is in the Appendix. 

The purpose of the spontaneous speech task was to elicit a less formal speech style in 
order to compare certain aspects of spontaneous production with sentence reading. The 
subjects were instructed to speak on the topic, "the most exciting or dangerous experience 
that I have ever had.” The spontaneous speech samples generally did not contain as high a 
concentration of consonant clusters as found in the sentence reading. The lack of control
over vocabulary also resulted in speech samples that did not contain the same types of 
consonant clusters. 

The authors recorded the speech samples using a high quality tape recorder and analyzed 
them auditorily. With the aid of the varispeech function of the tape recorder, the first and 
second authors transcribed the samples in moderately narrow phonetic detail. A speed 
reduction of 15% made it possible to identify more easily word boundary phenomena such 
as release attraction. When spot-checking each other’s transcriptions, the authors found
themselves in high agreement. 

Analysis of Speech Samples 

For each subject, a computation based on a count of potential and actual modifications 
of sentences yielded a percentage of flapping, linking, vowel reduction, and consonant 
cluster simplification. Excluded from the count of potential flappings, linkings, and 
consonant cluster simplifications were instances in which a pause (determined auditorily, not 
acoustically) occurred at word boundaries. A closer analysis of the linking category
distinguished instances of CC, CV, and VV linking. Similarly, consonant deletion and 
vowel epenthesis became subcategories of consonant cluster simplification. This study, 
however, did not examine the length or segmental composition of the clusters. 
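
The percentage computation described above can be sketched as follows. This is a minimal illustration, not the authors' actual procedure: the Site record, its field names, and the sample data are invented for the example. The only parts taken from the text are the ratio of actual to potential modifications and the exclusion of sites where a pause was heard at the word boundary.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """One potential site for a modification (hypothetical record layout)."""
    category: str        # e.g. "flapping", "linking", "cluster"
    modified: bool       # True if the speaker produced the modification
    pause: bool = False  # True if a pause was heard at the word boundary


def modification_rate(sites, category):
    """Percentage of potential sites in `category` that were modified.

    Sites with an audible pause are dropped from the count of potential
    sites before the percentage is taken.
    """
    eligible = [s for s in sites if s.category == category and not s.pause]
    if not eligible:
        return None  # no potential sites, so no rate can be computed
    actual = sum(1 for s in eligible if s.modified)
    return 100.0 * actual / len(eligible)


# Invented data for one speaker, for illustration only.
sites = [
    Site("linking", True),
    Site("linking", False),
    Site("linking", True, pause=True),  # excluded: pause at the boundary
    Site("linking", True),
    Site("flapping", False),
]
linking_rate = modification_rate(sites, "linking")  # 2 of 3 eligible sites
```

Subcategories such as CC, CV, and VV linking could be handled the same way by using a finer category label for each site.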

The analysis of spontaneous speech data proceeded as described with one exception. 
The study limited the high-proficiency data to two hundred and fifty syllables of continuous 
speech per subject but included all of the intermediate-proficiency data because each speaker 
produced fewer than 250 syllables. 

The data served to determine the percentage of simplification in the reading and 
spontaneous speech of each individual in the study and of the NS, HP, and IP groups. The 
final step involved subjecting the data to (1) a one-way analysis of variance and the Scheffé
test to examine the differences among the three groups on all categories and (2) the t-test
for paired observations to examine the differences between two groups and between the
sentence test and spontaneous task for each group (Hatch & Lazaraton, 1991). 
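
For readers unfamiliar with these procedures, the sketch below shows how a one-way analysis of variance and a Scheffé pairwise comparison can be computed for three groups of five speakers, which yields the degrees of freedom (2, 12) reported in the results. The percentages are invented for illustration and are not the study's data. The critical value 3.89 is the standard tabled F for df = 2, 12 at the 0.05 level, and the function names are hypothetical.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within, ms_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    ms_within = ss_within / df_within
    f_value = (ss_between / df_between) / ms_within
    return f_value, df_between, df_within, ms_within


def scheffe_pair(g1, g2, ms_within, df_between, f_crit):
    """Scheffe test for one pairwise contrast.

    The contrast counts as significant when its F value exceeds
    df_between * f_crit, where f_crit is the tabled critical F.
    """
    m1 = sum(g1) / len(g1)
    m2 = sum(g2) / len(g2)
    f_contrast = (m1 - m2) ** 2 / (ms_within * (1 / len(g1) + 1 / len(g2)))
    return f_contrast, f_contrast > df_between * f_crit


# Invented percentage rates for five speakers per group.
ns = [80, 85, 90, 75, 70]
hp = [70, 75, 65, 80, 60]
ip = [30, 40, 35, 45, 25]

F, df_b, df_w, ms_w = one_way_anova([ns, hp, ip])
F_CRIT_05 = 3.89  # tabled F(2, 12) at alpha = 0.05

ns_hp = scheffe_pair(ns, hp, ms_w, df_b, F_CRIT_05)
ns_ip = scheffe_pair(ns, ip, ms_w, df_b, F_CRIT_05)
hp_ip = scheffe_pair(hp, ip, ms_w, df_b, F_CRIT_05)
```

With these invented numbers the overall F is significant, the NS and HP groups do not differ, and the IP group differs from both, which is the kind of pattern the Scheffé test is used to detect.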

Before reporting the results of the study, we should highlight two of its limitations. First, 
since the study is macroanalytical in its concern with broad categories of connected speech 
modifications, distinctions within each category are not examined. Thus, the reader should 
not assume that all of the forms in each category were modified in the same way or with the 
same frequency. Second, the number of subjects who participated in the study is relatively 
small, limiting to some extent the generalizability of the findings. 



RESULTS AND DISCUSSION 



Sentence Test 

The bar graph in Figure 1 presents the mean percentage of speech modifications in the 
sentence-reading data of the NS, HP, and IP groups. The NS group displayed more linking, 
flapping, vowel reduction, and deletion than the HP group, and the HP group showed more 
than the IP group. As expected, epenthesis did not occur among the NS group. The IP 
group, however, used more epenthesis than the HP group. 

Flapping and Linking. The analysis of variance for the flapping category was significant 
at the 0.0063 level (F = 7.95, df = 2, 12). The Scheffé test showed a significant difference
between the IP group and the NS and HP groups (p < 0.05) but not between the NS and
HP groups. This indicates that the HP group is similar to the NS group in the use of 
flapping, while the IP group has not yet mastered flapping to the same extent. 

An analysis of the forms occurring in the IP group data showed a tendency to use an 
aspirated alveolar stop instead of the flap and to insert a glottal stop before the vowel 
beginning the following word: 

(21) [daet^.Tof] for that all 

When the IP speakers did use flapping, it happened most often within a word rather than at 
word boundaries: 

(22) [yunaind] for united 

Figure 2 presents an analysis of the CV, CC, and VV subcategories of linking. The 
percentage of linking in each subcategory is highest among the NS group and next highest 
among the HP group. Statistical analyses were done separately for each category of linking. 

Figure 1. Group Mean Percentage Rates for Linking, Flapping, Vowel Reduction, Consonant
Deletion, and Vowel Epenthesis on the Sentence Test.

Figure 2. Group Mean Percentage Rates for CV, CC, and VV Linking on the Sentence Test.

The analysis of variance was significant for the CC category (F = 11.55, df = 2, 12,
p = 0.002). The Scheffé test revealed the same pattern of differences that was found for
flapping. Significant differences were found between the IP group and the NS and HP groups
but not between the NS and HP groups, indicating that the HP group was using CC linking
with a frequency similar to that of the NS group. An analysis of the forms used showed a
tendency by the IP group for separate release of consonants as in (23), while the HP and NS
groups more often showed release attraction as in (24). 

(23) [hard.Oaet] for heard that 

(24) [har.dOaet] for heard that 

The means for the CV category in Figure 2 show a pattern similar to the pattern for the 
CC category, but the differences in means were only close to being significant because of 
larger within-group variation (F = 3.28, df = 2, 12, p = 0.07). An analysis of the forms 
used revealed that the IP group showed a strong tendency to keep word boundaries intact 
by inserting a glottal stop before the word-initial vowel in the second word as in (25). The 
HP group also showed the same tendency to insert glottal stops, although they did so less 
often. 

(25) [?aend.?aeskt] for and asked

The analysis of variance for the VV category was significant (F = 5.15, df = 2, 12,
p = 0.02), although the Scheffé test revealed a significant difference (p < 0.05) only
between the NS and IP groups. The HP group was not significantly distinct from the other 
two groups, in spite of the large difference in means between the NS and HP groups. This 
lack of significance is probably attributable to the large degree of variance found. A study 
of the forms used by the IP group showed a strong tendency to insert glottal stops between 
vowels at word boundaries, as in (26). The HP group also frequently inserted glottal stops 
at word boundaries, although not quite as often as the IP group. 

(26) [ði.ʔʌ.ðərz] for the others

In summary, (1) the IP group generally used flapping and linking significantly less often 
than the NS group did, and the quantitative differences were often dramatic; (2) the HP 
group did not differ significantly from the NS group on either flapping or linking, although 
a large but non-significant difference in means surfaced in the VV linking category; (3) the 
HP group differed significantly from the IP group on flapping and CC linking, but nowhere 
else. 

Thus while the HP group approached the NS group in its rate of flapping and linking, 
in most cases the IP group lagged behind. The analysis of forms for flapping and linking
showed that the IP group demonstrated a tendency to preserve word boundaries. In CC 
clusters, this was accomplished by maintaining a separate release of the consonants. In CV 
and VV clusters, the word boundaries were maintained through the insertion of glottal stops.
The HP group also inserted glottal stops before words beginning with vowels, although not 
as often as the IP group. 

Vowel Reduction. The mean scores for all three groups on the vowel reduction category
are presented in Figure 1. The analysis of variance was significant at the 0.0007 level (F
= 14.35, df = 2, 12), although the Scheffé test showed a significant difference only between
the NS group and the HP and IP groups. No significant differences appeared between the 
HP and IP groups. An analysis of the forms used by the HP and IP groups showed that both 
groups reduced their vowels mainly in the definite and indefinite articles. They rarely 
reduced vowels in words such as you or to or in unstressed syllables in words such as
request. This failure to reduce vowels may arise at least in part from native language 
transfer, since Japanese vowels retain their pure quality and are never centralized as they are 
in English. 

It is interesting to note that the HP group was similar to the IP group on the three 
categories of connected speech involving vowels: CV linking, VV linking, and vowel 
reduction. Although the failure to reduce vowels may be due at least in part to native 
language influence, the failure to link word-initial vowels with other segments at word 
boundaries cannot be so explained. It is possible that the tendency to keep vowels intact may 
be related to a concern for intelligibility, although it is not clear why vowels, and not 
consonants, would be singled out. 

Consonant Cluster Simplification. The differences between Japanese and English
syllable structure raised three questions about cluster simplification. The data reflected in 
Figure 1 helps to answer these questions. First, is the total consonant cluster simplification 
rate higher for the Japanese groups than for the NS group? The simplification rate of the 
IP and HP groups is the combination of consonant deletion and vowel epenthesis categories, 
while the simplification rate for the NS group reflects only consonant deletion, since no 
instances of vowel epenthesis occurred. 

The total rate of simplification was 33.9% for the IP group and 23.0% for the NS group.
The t-test revealed borderline significance (t = 2.20, p = 0.059, df = 8). The simplification
rate for the HP group was 23.6%. No significant differences were found between the
HP and NS groups. These results indicate that native language transfer influenced the total 
rate of simplification only for the IP group. 

The second question concerned the strategy for simplification: epenthesis or deletion.
Which strategy would the Japanese speakers use most often? Within each of the Japanese 
groups, the deletion rate was higher than the epenthesis rate, although it was significantly 
higher only for the HP group (t = 7.54; p = 0.0017; df = 4). While these results confirm 
the prediction of native language transfer theory that the Japanese group would use 
epenthesis as a simplification strategy, they do not support the prediction that it would be 
used more often than deletion. The results certainly do not strengthen Thompson’s claim
that consonant clusters in Japanese are simplified solely through epenthesis. Instead these 
findings match those of Saunders (1987) which showed that Japanese ESL learners use both 
strategies of simplification. 
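
The degrees of freedom reported above are what one would expect from two different forms of the t-test applied to groups of five: df = 8 for a comparison between two independent groups (5 + 5 - 2) and df = 4 for a comparison of two measures within one group (5 - 1). That reading is an inference from the reported values, not something the text states. The sketch below computes both statistics on invented percentages; the data and function names are hypothetical.

```python
import math


def pooled_t(a, b):
    """Independent-samples t with pooled variance; df = n1 + n2 - 2."""
    n1, n2 = len(a), len(b)
    m1, m2 = sum(a) / n1, sum(b) / n2
    ss1 = sum((x - m1) ** 2 for x in a)
    ss2 = sum((x - m2) ** 2 for x in b)
    df = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / df
    t = (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df


def paired_t(a, b):
    """Paired t on per-speaker differences; df = n - 1."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    t = mean_d / math.sqrt(var_d / n)
    return t, n - 1


# Invented total simplification rates (%) for two groups of five speakers.
ip_total = [30, 40, 35, 25, 40]
ns_total = [20, 25, 30, 20, 20]
t_between, df_between = pooled_t(ip_total, ns_total)

# Invented deletion and epenthesis rates (%) for the same five speakers.
hp_deletion = [20, 22, 18, 25, 20]
hp_epenthesis = [2, 4, 1, 3, 5]
t_within, df_within = paired_t(hp_deletion, hp_epenthesis)
```

The resulting t values would then be compared against tabled critical values for the corresponding degrees of freedom.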

These results are also consistent with a study of Egyptian Arabic learners of English 
who showed a higher rate of deletion than epenthesis despite contrastive analysis predictions 
to the contrary (Anderson, 1983). This preference may arise from the fact that deletion is 
a more natural or universal process than epenthesis in the sense that children use it more 
often than epenthesis to simplify consonant clusters when acquiring their first language 
(Oller, 1976).

The third question about consonant cluster simplification concerned the extent to which 
the groups differed from each other in their rates of simplifying clusters through deletion and 
epenthesis. For consonant deletion, the analysis of variance revealed no significant 
differences among the three groups. The consonants deleted most often by all three groups
were /t/ and /d/, although this tendency was observed more often in the NS group. 
However, the Japanese groups also occasionally deleted consonants not deleted by the NS 
group, namely, final /l/, /m/, and /n/.

These results differ from the findings of Hieke (1987) in which the consonant deletion 
rate of the native speakers was significantly higher than that of the nonnative speakers. 
Since his study does not mention the native language backgrounds of his subjects, one can 
only speculate that the syllable structure of their native languages did not contrast as radically 
with English as Japanese syllable structure does. 

As for epenthesis, the t-test indicated that the IP group used this strategy significantly
more often than did the HP group; the differences were significant at the 0.009 level (t = 
3.46; df = 8). Although contrastive analysis correctly predicted that both groups would use 
epenthesis as a strategy, it did not predict that the IP group would use it more often. Yet, 
these results agree with the earlier studies on second language phonology which have shown 
that native language transfer is more apparent in the earlier stages of second language 
development (L. Dickerson, 1974; Major, 1987). 

In summary, (1) the combined rate of simplification revealed a higher rate of 
simplification for the IP group than for the NS group; (2) the dominant strategy for 
simplification for both the HP and IP groups was consonant deletion, and not epenthesis; (3) 
no significant differences were found among the three groups on consonant deletion; (4) the 
IP group used epenthesis more often than the HP group. These results indicate that while 
native language transfer affects the overall rate of simplification and choice of strategy, it 
is limited in the extent to which it can accurately predict under what circumstances, at which 
proficiency level, and with what frequency certain forms will occur. 

Sentence Task versus Spontaneous Task 

Another important factor to be considered in connected speech modifications is style 
shifting. This study compared the sentence and spontaneous data within each group for only 
the linking, deletion and epenthesis categories. The paired t-test was used to determine 
whether the differences between means were statistically significant. 

NS Group. The mean percentages of linking and deletion among the NS group on the 
sentence and spontaneous tasks are presented in the bar graph in Figure 3. Not surprisingly, 
the linking and consonant deletion rates were higher during the spontaneous task than during 
the sentence task. The t-test was significant at the 0.0041 level for linking (t = 5.92, df = 
4) and at the 0.05 level for consonant deletion (t = 2.8, df = 4). 

HP Group. The mean percentage scores for the linking, deletion and epenthesis among 
the HP group for the sentence and spontaneous tasks are presented in the bar graph in Figure 
4. As in the NS group, the linking and consonant deletion occurred at a higher rate during 
the spontaneous task than during the sentence task. The t-test was significant at the 0.002 
level for linking (t = 7.69, df = 4) and at the 0.01 level for deletion (t = 4.16, df = 4).
The t-test was nonsignificant for the epenthesis category. 

Figure 3. Group Mean Percentage Rates for NS Group for Linking and Deletion on Two Tasks.

Figure 4. Group Mean Percentage Rates for HP Group for Linking, Deletion, and Epenthesis on Two Tasks.

Figure 5. Group Mean Percentage Rates for IP Group for Linking, Deletion, and Epenthesis on Two Tasks.

IP Group. Figure 5 displays the linking, deletion, and epenthesis behavior among the
IP group as they read sentences and spoke spontaneously. Although their linking and 
deletion rates were higher during the spontaneous task, only the linking rate was significantly 
higher according to the t-test (t = 3.97, df = 4, p = 0.02). The t-test was non-significant 
for the deletion and epenthesis categories. 

The general finding of significantly higher linking and deletion rates when speech is
more casual (the spontaneous production) than when speech is more formal (the reading aloud
of sentences) corroborates the results of other research on second language style shifting (L.
Dickerson, 1974; Schmidt, 1977; Beebe, 1980; Major, 1987). It is important to note once
again that the prepared sentences contained a much higher concentration of consonant 
clusters than did the spontaneous speech samples. Had the spontaneous speech samples been 
equivalent to the prepared sentences in the number and types of clusters present, the 
consonant deletion rates for spontaneous speech would probably have been even higher. The 
rates for epenthesis, however, were not significantly different from one task to another for 
either group, indicating that epenthesis is not as sensitive to style shifting as linking and 
deletion are. 

Also of interest in this study were the specific types of modifications that occurred in 
the spontaneous speech samples. In addition to the linking and consonant deletion during 
the spontaneous task, the NS group also used numerous instances of weak syllable deletion, 
a category of simplification not systematically investigated in this study. In some cases, 
syllable deletions were so drastic that the words would be unintelligible apart from their 
context. Examples of such radical modifications are in (27) and (28). The Japanese groups 
only rarely deleted syllables during the spontaneous task. 

(27) [pram] for problem 

(28) [loz] for it was 



CONCLUSIONS 

The purpose of this study was to compare the connected speech modifications of
intermediate- and high-proficiency Japanese ESL learners and native speakers of American
English, produced during two different speaking tasks, and to assess the effects of language
proficiency, native language transfer, and style shifting on those modifications.

The results clearly show that language proficiency is an important factor affecting speech 
modifications. The HP group displayed significantly more modifications in three of the 
categories investigated when compared with the IP group. Also, while the HP group 
approached the NS group in their rate of modifications in many of the categories 
investigated, the IP group, for the most part, showed dramatically lower rates of 
modification. Insofar as one can infer longitudinal development from a cross-sectional study 
such as this, it appears that as Japanese ESL learners achieve higher speaking proficiency, 
they modify speech more frequently in some of the same ways that native speakers do.

The study also shows that native language transfer affects connected speech 
modifications, although the theory is limited in its ability to predict exactly where and how 
often certain forms will occur. Since the model of native language transfer is static, rather
than dynamic, it cannot by itself accurately predict the development that occurs in second 
language acquisition. 

Further, the results of this study show that style shifting is as important a factor in the 
connected speech modifications of Japanese ESL learners as it is in native English speech; 
a higher rate of simplification occurred during spontaneous speaking than during sentence 
reading for most of the categories investigated. However, it is important to note that 
Japanese learners did not exhibit the frequent radical restructuring of forms evident in the 
native speakers’ spontaneous speech. 

In conclusion, this study has confirmed that a combination of factors (style, native
language, and language proficiency) influences the overall rate of modifications in the
connected speech of Japanese learners of English. Future research in this area must take into
account all of these factors. The direction of that future research should be to examine the
aspects of connected speech not addressed in the macroanalysis approach adopted in this
study, namely, the finer distinctions in connected speech modifications: types of forms in
each category, their phonetic environment, and their position in the utterance.
APPENDIX
Sentence Reading Test* 

1. A large group of students graduates each spring. 

2. I heard that splendid speech you made last night. 

3. He changed his mind and lunched at the student cafeteria. 

4. They answered correctly, and the instructor thanked them. 

5. I request that all books be removed from the desks. 

6. He will need all his strength to catch the others. 

7. The next time you come we must speak Swahili. 

8. Someone’s trying to turn my friends against me. 

9. Does she like this part of the United States? 

10. George nudged me and asked if we hadn’t watched long enough. 

11. I wonder why that child acts so strangely. 

12. The baby has a big splinter in the skin of his finger. 

13. Thanksgiving comes the last Thursday in November. 

14. Do you expect to catch the next train? 

15. We’ll have to risk using the old screens this year. 



*Sentences from Manual of American English Pronunciation, Fourth Edition by Clifford 
M. Prator and Betty W. Robinett, pp. 187-188. Copyright © 1985 by Holt, Rinehart &
Winston. Reprinted with permission of the publisher. 
IDEAL 7, 1994



DISCOURSE STRESS AND PHRASAL VERBS 



Wayne B. Dickerson 



The proper placement of discourse stress is the result of an intricate
orchestration of stress rules at the word, construction, and discourse levels.
This paper focuses on the interaction of the top two levels, construction and
discourse, both of which are necessary but neither of which is sufficient by
itself to position stress meaningfully in sentences. The construction at issue is
the phrasal verb, also called the two- and three-word verb and the multiword
verb. This common and highly productive structure, while grammatically
complex, is the picture of simplicity where stress is concerned. Its
pronunciation patterns are easy enough to understand and apply that even
low-level adult ESL/EFL learners can use them, in concert with discourse
patterns, to improve the accuracy and intelligibility of their speech.



INTRODUCTION 

Discourse stress is a signpost in English sentences telling the listener what to pay 
attention to. When the stress is well placed, it alerts the listener to the portion of the sentence 
the speaker considers particularly relevant at the moment. When it is misplaced, the stress 
misdirects the listener's attention and may affect the listener's interpretation or comprehension 
of the utterance. For example, referring to the weather and a past picnic, the speaker who 
said, "I THOUGHT it would rain," meant not only that the thought occurred, but also that it 
did rain. Had the speaker said, "I thought it would RAIN," it would have meant not only that 
the speaker had the thought, but also that it did not rain. Well-placed stress in discourse does 
more than make sentences sound nativelike; it is a critical substratum of the speaker's 
message. It says to the listener, 'I want you to focus here.'

ESL/EFL learners frequently mislead their listeners and threaten intelligibility by their 
misuse of stress in sentences that contain phrasal verbs. As intractable as this problem may 
seem, it need not persist. This paper identifies not only the nature of the problem but also the 
simple generalizations that can help ESL/EFL learners produce more accurate speech and 
thereby send more accurate signals. 



THE NATURE OF THE PROBLEM 

The misstress of phrasal verbs is serious not only because of the high proportion of 
stress mistakes, but also because phrasal verbs are so common in English that recurrent errors 
affect many sentences in discourse. Interestingly, ultimate responsibility for such poor student
performance appears to lie with ESL/EFL professionals (textbook writers and teachers), not
their students.
Learners' stress errors in sentences with phrasal verbs are of three kinds. First,
discourse stress is on a verb head when it should occur on a particle: *I planned to TURN it
down. Second, stress lands on a particle when it should be on the verb head: *I warned them
ABOUT it. Third, discourse stress appears on the verb (head or particle) when it should be
on the object or elsewhere in the sentence: *I planned to TURN down the proposal, so I
warned my friends ABOUT it.

An informal survey of students in an advanced pronunciation class suggests the extent 
of such problems. A diagnostic test given to sixty-five international graduate students at the 
University of Illinois (Urbana) at the beginning of the 1992 fall semester contained eight 
instances of two- and three-word verbs: to come along, look at, turn down, conjure up,
concentrate on, drop off, keep up with, find out about. Half of the students had an error rate 
of 80% or greater. The rate was 60% or greater in three-quarters of the cases. In the 
remaining quarter of the group, the error rate ranged from 20% to 46%. Overall, students 
successfully placed discourse stress in these sentences an average of only 37% of the time. 

International graduate students are not unique in this respect. Christopher Gutknecht
(1992:263), while noting that his German students generally have few problems with English
stress and intonation, also says, "The only place where difficulties might arise is in the
production of phrasal verbs or verbs with post-placed prepositions, where the learner has to
know which component, that is the verb head or the particle, receives the nucleus [i.e.,
discourse stress]."

If phrasal verbs were uncommon in English, these stress errors would not be worth 
commenting on. But the high frequency of multiword verbs in ordinary speech makes stress 
accuracy crucial. Celce-Murcia & Larsen-Freeman (1983:265) observe that phrasal verbs "are such
an important part of colloquial English that no one can speak or understand conversational or
informal English without a knowledge of phrasal verbs." Contributing to their frequency is the
fact that English speakers are coining new ones all the time. Bowen (1975:256) notes that
"[t]he use of phrasal verbs is perhaps the most productive pattern of lexical creativeness in
modern English. New combinations are constantly being added to the lexicon." There are
so many, in fact, that a dictionary of phrasal verbs alone is an impressive volume; Courtney’s
Longman Dictionary of Phrasal Verbs (1985) runs 734 pages and has over 12,000 entries,
most with multiple meanings.*

Why don't students do better with phrasal verbs? A large part of the problem is that 
while students learn the grammar of phrasal verbs, their materials do not teach the associated 
stress patterns. A survey of the first forty grammar texts in the ESL library catalogue at the 
University of Illinois revealed that the topic of phrasal verbs occurs in every one of them. Yet 
none of the authors offers a word of help with the stress pattern of these verbs. Even books 
devoted entirely to phrasal verbs (e.g. Burke 1991, Heaton 1965, Hook 1981, Svenconis 
1982) have little if anything to say about the stress of phrasal verbs in discourse context. So 
where can students find stress-pattern information? A likely place is in pronunciation texts. 
Yet an examination of all the pronunciation texts in the University of Illinois ESL library 
(twenty-four in all) showed that only eight discuss phrasal verbs at all, while the remaining 
sixteen have nothing to say on the topic. None mentions three-word verbs. The assumption 
of most authors appears to be that the stress of phrasal verbs, like the stress of polysyllabic 
words, is part of what the student must memorize along with the word. Most discouraging 
is that the others, for all their good intentions, do not describe the stress of phrasal verbs 
accurately enough to help students speak correctly. 

The situation is clear: Students regularly receive detailed structural information on 
phrasal verbs because they are so much a part of everyday English no speaker can escape 
them. But ESL/EFL instruction virtually ignores their pronunciation in discourse with the 
result that attention-distracting stress mistakes blemish students' oral use of phrasal verbs. 



DISCOURSE STRESS 



The Basic Pattern 



In recent years ESL/EFL teachers have come to understand the function of discourse 
stress largely because of their interest in the communicative uses of language. Although there 
is much yet to learn, the broad outlines of the stress system are becoming clear. A brief 
review of its central features will help to frame the issue of phrasal verbs in discourse more 
precisely. The following dialogue from Dickerson (1989, unit 2, p. 16) will serve as an 
illustration. Here, one teacher is discussing a student with another teacher and asks: 



A.  How are Jean's GRADES, by the way?        A.  How are Jean's grades, by the way?
B.  In my class, she gets GOOD grades.        B.  In my class, she gets good grades.
    I mean REALLY good grades.                    I mean really good grades.
A.  Does she have many FRIENDS?               A.  Does she have many friends?
B.  A FEW friends, a few CLOSE friends.       B.  A few friends, a few close friends.



An important function of discourse stress is to highlight new information introduced
into a conversation. The dialogue above is repeated to the right with new information
underlined. By convention, we usually put the discourse stress on the last content word in the
string of new information (not on the last word or the last content word in a sentence).
Content words are the main nouns, adjectives, verbs, and adverbs as distinct from function
words, such as prepositions, articles, pronouns, auxiliary verbs, and conjunctions. Therefore,
in teacher A's first utterance, discourse stress attaches to grades. By the way is a parenthetical
comment that carries no prominent stress. So in teacher B's first phrase, grades is old. The
last content word in what is new receives the stress, namely, good. In B's second phrase,
good and grades are now old; the string of new material ends in the intensifier, really, which
discourse stress marks. A's next turn advances the conversation with something new, ending
in the content word friends, which A stresses. In B's last line, containing two phrases, friends
is old, having been introduced by A's previous question. So in answer, B stresses few as the
last content word in the string of new information. Therefore, by the time B comes to the last
phrase, a few and friends are common currency, and discourse stress falls on close as the only
thing new in that phrase. As this dialogue illustrates, discourse stress plays an important role
in sentences; it directs the listener to what the speaker considers new additions to the semantic
substance of the message. A fuller exposition of the discourse stress system can be found in
Dickerson (1989).
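
The basic generalization can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the `Word` record, its `content` and `new` flags, and the function name `discourse_stress` are hypothetical labels introduced here, and the analyst is assumed to have already decided which words are content words and which are new information.

```python
from dataclasses import dataclass

@dataclass
class Word:
    text: str
    content: bool  # True for main nouns, adjectives, verbs, adverbs
    new: bool      # True if the word belongs to the new information

def discourse_stress(words):
    """Return the last content word that is new information, or None."""
    for word in reversed(words):
        if word.content and word.new:
            return word.text
    return None

# Teacher B: "In my class, she gets good grades." (grades is old information)
line = [
    Word("in", False, True), Word("my", False, True), Word("class", True, True),
    Word("she", False, False), Word("gets", True, True),
    Word("good", True, True), Word("grades", True, False),
]
print(discourse_stress(line))  # good
```

The sketch scans from the end of the phrase, so old-information words such as grades are skipped even though they are content words.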
As noted, the basic generalization that captures the behavior of discourse stress in the
dialogue above is that discourse stress falls on the last content word in the string of new 
information. At issue in this paper is the question: Does this generalization apply equally to 
sentences with phrasal verbs? The answer depends on where the phrasal verb is located. 

If the phrasal verb is not at the end of the string of new information, no problems 
arise. On the assumption that the following examples contain new information in all but their 
pronouns, discourse stress falls accurately on the last content word in utterances with two-word
verbs (first set) and three-word verbs (second set).

She couldn’t stop talking about the wédding.
I complimented her on her achíevement.
The police accused him of the críme.
I'm sorry they watered down the repórt.
We couldn't make out his sígnature.
She soon got over her féar.

I'm tired of trying to get along with such a gróuch.
He was called away on urgent búsiness.
Then we went back to our cábin.
Let's team him up with Álice.
Bill apologized to the chairman for his cómments.
We've got to get down to búsiness.

If the phrasal verb ends the new information, stress properly falls on the verb. The 
discourse stress rule, however, cannot place the stress on the verb because the rule does not 
know what to do when the new-information verb is in more than one word, as in the next 
examples. The rule fails in part because phrasal verbs are not single content words. They are 
multiword strings that function as a unit; they are constructions like compound nouns, 
compound adjectives, etc. 

She wouldn't stop talking about it. 

I'm sorry they watered it down. 

I'm tired of trying to get along with him. 

Let's team him up with her. 

Therefore, an important repair of the stress rule would be to acknowledge constructions: 
Discourse stress falls on the last content word or construction in the string of new 
information. This accommodation is helpful not only for phrasal verbs but also for strings 
of new information that end in constructions of any kind. 

Implied in the revised rule is the understanding that discourse stress will fall on the last 
content word or construction according to the stress pattern of the word or construction.
That is, discourse stress on a last content word attaches to the syllable of the word ordinarily 
stressed when the word is in isolation. For example, if the word compartment were the last 
content word, discourse stress would be on the second syllable because that is the stress 
position of the word when spoken out of context. Similarly, when the discourse stress rule 
assigns stress to the last construction, it comes to rest on the part of the construction
commonly stressed when the construction is spoken alone. 

Two levels of stress are obviously involved. The discourse stress rule, as modified 
above, is adequate for the discourse level. But to use it where phrasal verbs or any other 
constructions are concerned, discourse stress must follow the stress rules of the construction 
level. So, what are the rules governing the stress of phrasal verbs? 



THE STRESS OF PHRASAL VERBS 



Two Central Issues 

In the sentences below, where the final phrasal verbs are new information, the 
discourse stress rule must place stress on each verb construction, as indeed it does. The 
obvious problem is that the position of stress is not always the same from verb to verb. In the 
first three cases, stress is on the verb head, while in the second three, stress is on the first or 
only particle. Part of the construction stress rule for phrasal verbs must address cases like 
these in which the components of a final phrasal verb are adjacent in the new information. 
(Since pronouns refer back to past referents, they are not new information.) 

What are they árguing about?        He's going to get awáy with it.
I wish I could depénd on him.       How many did you send óut?
They híd it from him.               She never put it ón.

Another problem surfaces when the parts of a phrasal verb are separated by other
sentence components, as in the following utterances. In each, only a portion of the phrasal
verb, the particle(s), ends the new-information string. Yet discourse stress falls neither on the
final part of the verb nor on the verb at all. So another part of the construction stress rule
must clarify cases in which the phrasal verb has become fragmented and other
new-information content words follow the verb head.

We convinced the commíttee of it.      She took the télephone apart.
He talked pássionately about them.     He took his ráge out on her.

To place discourse stress accurately in the ten examples above, the discourse stress rule 
must rely on a construction stress rule that adequately describes the stress of final phrasal 
verbs having adjacent and non-adjacent components in the new-information string. The
following two sections examine these two cases. 

Head or Particle? 

Grammarians have categorized phrasal verbs in a multitude of ways. Some of their
distinctions define a two-word verb. For example, a few researchers consider as two-word
verbs only those that carry idiomatic meaning. Verb + particle combinations that carry literal
meaning are not two-word verbs. Others treat verb head + adverb as the true two-word verb;
verb head + preposition is not. Other distinctions dichotomize structural behavior: transitive
vs. intransitive, separable vs. inseparable, verbs taking a nominal complement vs. those taking
a verbal complement, etc. The Longman dictionary referred to above puts phrasal verbs into
fourteen structural classes. While these distinctions and categories may be technically useful,
they do not matter for pronunciation purposes; all phrasal verbs fit the same basic stress rule.

The rule, however, is not the one cited in the few pronunciation texts that mention
phrasal verbs. For instance, the statement, "[t]wo-word verbs are stressed on the second
word: go awAY, get UP" (Hagen & Grogan, 1992:206), is oversimplified and misleading.
A couple of texts go a bit further by saying that “[t]here are some frequently used, separable 
verb-preposition combinations in which the stress is on the first element: call for, listen to, 
laugh at, think about" (Handschuh and de Geigel, 1985:12-13). None of these descriptions 
does justice to the facts of two- and three-word verbs. Pronunciation rules must do better than 
this if they are to be helpful to learners. 

While grammar books may slice the verb-construction pie in a host of ways, it is
useful for stress purposes to think of just three types of phrasal verbs: two kinds of two-word
verbs and one kind of three-word verb.



Verb      + Stress        Verb       - Stress        Verb    + Stress    - Stress
Head      Particle        Head       Particle        Head    Particle    Particle

figure    out             look       at              run     away        with
drop      off             talk       about           walk    out         on
take      over            dispense   with            talk    down        to
look      back            approve    of              get     ahead       of



In the left-hand column are two-word verbs that in isolation have stress on the particle.
They have what we call stressable particles (+ stress). (The term "particle" avoids the
irrelevant distinction between adverbs and prepositions.) In the middle column are words
often stressed on the verb head. They have stressless particles (- stress). In the last column
are three-word verbs. They combine the two types of two-word verbs. The first particle is
always stressable; the second particle is nearly always stressless.



Since stressless and stressable particles are virtually non-overlapping sets, the type of
particle that follows the verb head is an excellent guide to the stress behavior of two- and
three-word verbs. To distinguish a stressless particle from a stressable particle, all that is
necessary is to recognize the small set of stressless particles. All the rest are stressable.
Stressless particles are a set of eight: about, at, for, from, of, on, to, with.

One of these eight, on, is a stressless particle only when the two-word verb falls into
certain semantic domains. If the meaning of the verb has to do with cognitive or communication
activities, then on is a stressless particle. Verbs like to agree on, insist on, settle on,
plan on, decide on, concentrate on, rely on, depend on, count on, bank on, call on, lecture
on, talk on, speak on, tell on, preach on, comment on, touch on, enlarge on, dwell
on, compliment on, etc., have a stressless particle. If the verb has another meaning, such as
to put on, turn on, then the particle is stressable.
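
The particle distinction lends itself to a simple lookup, sketched below in Python. The names `STRESSLESS`, `COGNITIVE_ON_HEADS`, and `particle_is_stressable` are hypothetical, and the list of verb heads is only a partial sample drawn from the examples above. Keying on the verb head alone is an assumption; a fuller treatment would have to consider the meaning of the verb in context.

```python
# The eight stressless particles listed in the text.
STRESSLESS = {"about", "at", "for", "from", "of", "on", "to", "with"}

# Partial, illustrative sample of cognitive/communication verb heads
# that take a stressless "on" (assumption: head alone identifies the sense).
COGNITIVE_ON_HEADS = {
    "agree", "insist", "settle", "plan", "decide", "concentrate",
    "rely", "depend", "count", "comment", "dwell",
}

def particle_is_stressable(head, particle):
    """Return True if the particle would carry stress in the isolated verb."""
    if particle not in STRESSLESS:
        return True          # all other particles are stressable
    if particle == "on":
        # "on" is stressless only with cognitive/communication verbs
        return head not in COGNITIVE_ON_HEADS
    return False

print(particle_is_stressable("figure", "out"))  # True
print(particle_is_stressable("look", "at"))     # False
print(particle_is_stressable("depend", "on"))   # False
print(particle_is_stressable("put", "on"))      # True
```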
In the examples below, the assumption is that the final phrasal verbs are new
information and should receive the discourse stress. In the first set are two-word verbs with 
stressless particles; discourse stress attaches to the verb head. 



about    What's she compláining about?
         I nearly forgót about it.
at       I don't like being láughed at.
         We were all annóyed at him.
for      Who did you vóte for?
         How much did he páy you for it?
from     Where does he cóme from?
         Who were you protécting him from?
of       That's what I was thínking of.
         Don't let him convínce you of it.
on       Politics is something they could never agrée on.
         Please try to cóncentrate on it.
to       Who should I apólogize to?
         Please réad it to me.
with     Here are some of the people I wórk with.
         How could you trúst him with it?



Next are examples of two-word verbs with stressable particles. In each case, discourse 
stress is on the particle. There are many more stressable particles than stressless ones. 



around   They were just standing aróund.
away     The couple wanted to run awáy.
back     When will you come báck?
behind   We left them behínd.
by       Don't let this chance slip bý.
up       Will he ever give úp?
down     The fever really got him dówn.
forward  The car lurched fórward.
off      Negotiations had broken óff.
out      Before we knew it, he had passed óut.
in       When did you move ín?
over     We tried to talk it óver.



Finally, three-word verbs behave like two-word verbs, having discourse stress on the 
first of the two particles. 

around   I just never got aróund to it.
away     They couldn't get it awáy from him.
back     What do you think, now that you look báck on it?
forward  I'm really looking fórward to it.
off      How much money did he make óff with?
on       Well, let's get ón with it.
up       That's something I need to brush úp on.
The following dialogue from Dickerson (1989, unit 3, p. 97) provides an opportunity
to see how phrasal verbs behave in discourse. The focus is on verbs at the end of strings of 
new information. Instances of such verbs are in brackets to the right of each line; each is 
marked appropriately. It is instructive to note how the two types of particles govern discourse 
stress. In the following, the pipe sign ( | ) separates unmarked phrases within a sentence, and 
an asterisk (*) designates a compound noun. 



A clothing clerk talks with a customer.

A.  Hello. What can I do for you?                         [dó for]
B.  I was looking for some summer slacks.
A.  Please feel free to browse around.                    [browse aróund]
B.  Do you do alterations?
A.  Yes we do. If it's too large, we can take it in.      [take ín]
    If it's too long, we can cut it off | and hem it up.  [cut óff, hem úp]

    Customer finds some slacks

B.  Here's what I was looking for.                        [lóoking for]
A.  Would you like to try them on?                        [try ón]
B.  Yes, I would.
A.  The *dressing room is free.
    If I can help, please ask for me.                     [ásk for]

    Customer returns in a few minutes

A.  How do they fit?
B.  I don't think I care for them.                        [cáre for]
A.  Perhaps a different pair?
B.  No. I don't think I feel up to it.                    [feel úp to]
A.  That's all right. Thank you for coming in.            [coming ín]
    Please hurry back.                                    [hurry báck]



One part of the phrasal verb stress pattern is now clear, namely, the importance of the 
particle when verb construction components are contiguous at the end of a new-information 
string. Fortunately, the distinction between stressless and stressable particles is simple and 
its application straightforward. 

Verb or Content Word? 

Another part of the stress pattern concerns cases in which a content word of some kind 
comes between the verb head and its particle. Even though the particle belongs to the verb 
and is the last piece in the string of new information, the particle does not receive discourse 
stress if the preceding content word (or construction) is also new information. It does not 
matter whether the particle is stressless or stressable; it behaves like any other function word 
in not attracting discourse stress. Instead, the last new content word wins the stress. In the 
following examples, stressless particles are in the left column; stressable particles are in the 
right. 



We tricked Náncy with it.            They can work the details out.
He looked intently at her.           Then she tried a coat on.
Please thank Anne for me.            We couldn't talk Bíll into it.
He told the police about her.        Will he let his family in on it?

The second part of the rule for phrasal verbs is now clear, also. When the verb head 
and its particle are separated by a new-information content word or construction, stress 
placement conforms to the discourse stress rule, namely, stress falls on the last content word 
or construction, disregarding the particle. 



DISCOURSE STRESS AND NEW INFORMATION 

The only reason a construction stress rule comes into play when assigning discourse 
stress is that the construction is part of the new information in the phrase. If a phrasal verb 
is new information, then the phrasal-verb stress rule can help position discourse stress 
properly. The preceding discussion has argued for a rule with two parts. 

First, when a new-information content word or construction follows the verb head, the 
final particle has the status of a function word, allowing stress to fall on the last new- 
information content word or construction. Second, when no new-information content words 
or constructions follow the verb head, construction stress depends on the distinction between 
stressless and stressable particles. These two points make it possible to assign discourse stress 
to any sentence with a phrasal verb according to the stress pattern of the verb construction. 



The following dialogue based on Dickerson (1989, unit 3, p. 98) illustrates the stress 
behavior of phrasal verbs, particularly the observation that particles separated from the verb 
head by content words or constructions have no special status. In this example, the pipe 
symbol (|) marks off phrases within sentences; the asterisk (*) identifies compound nouns; 
NI is a stressed new-information content word or construction; and ni and oi are unstressed 
content words or constructions, new and old, respectively. 



A pair of mechanics help a customer change a tire on a new car.

A. The procedure is simple | if you think about the steps.              [think about NI]
B. First, put the *emergency brake on.                                   [put NI on]
A. Then open up the trunk |                                              [open up NI]
   and take the spare tire out.                                          [take NI out]
B. You'll see the jack and *lug wrench fastened to the wheel.            [fastened to NI]
A. Next, loosen the *wheel lugs | before jacking the *car body up.       [jack NI up]
B. Now, raise the tire off the ground, and finish unscrewing the lugs.   [raise oi off NI]
   Next, take the wheel off of the axle, and                             [take oi off of NI]
   put on the new one. But just handtighten the lugs.                    [put on NI]
A. Finally, lower the car, and use the wrench to
   really tighten down on the bolts.                                     [tighten down on NI]
B. Now, you can wrap your tools up |                                     [wrap NI up]
   and put them away.                                                    [put awáy]



DISCOURSE STRESS AND OLD INFORMATION 



The foregoing discussion has emphasized the importance of new information. 
Sometimes, however, final phrasal verbs or the content words and constructions after the verb 
head are not new information. In the first dialogue above, new information becomes old when 
it is repeated verbatim (bolded): How are Jean's GRADES? She gets GOOD grades, 
REALLY good grades. But often enough, speakers avoid exact repetitions, preferring instead 
a synonym for the old content or a paraphrase of it. When this happens, discourse stress still 
eschews the old information and looks for new information elsewhere. 

The next dialogue illustrates the use of synonyms for verbs originally introduced as 
new information, namely, accompany and escape. The synonyms in this example are phrasal 
verbs. Nevertheless, as old information, they are ineligible for discourse stress, which falls 
somewhere else in the sentence. (The assignment of discourse stress to some of these sentences 
goes beyond the basic generalizations of this paper and will not be discussed here.) 

Police question a relative of a jail escapee. 

A. We're here about your brother. 
   We want to know who accompanied him | 
   when he escaped. Was it Sarah? 

B. She might have gone with him.             [synonym of accompany] 
   Why not ásk her if she tagged along?      [synonym of accompany] 

A. Did you hélp him get away?                [synonym of escape] 

B. I wasn't even in town when he took off.   [synonym of escape] 

An example of paraphrased old information is in the following dialogue where the 
event of running out of gas is referred to in a variety of ways (Dickerson 1989, unit 3, p. 98). 
The paraphrases (that day, such a situation, the problem, an adventure like this, the 
experience) are content words that follow the verb head of phrasal verbs. Even so, they are 
not candidates for discourse stress because, while the words are new, they refer to old 
information. By contrast, the phrasal verbs are new information and, being the last content 
words in those new-information strings, they capture discourse stress. Again the pipe sign ( |) 
separates unmarked phrases; the asterisk (*) identifies compound nouns; NI is a stressed new- 
information content word or construction; and the abbreviation oi stands for an old- 
information content word or construction that is unstressed. 

An unhappy parent talks with the camp bus driver. 

A. I heard you ran out of gas | on the way to camp last week. [ran out of NI] 

B. Don't remind me of that day! [remind of oi] 
A. Apparently you weren't prepared for such a situation.             [prepáred for oi]

B. Not really. But it certainly livened up our afternoon.            [livened up NI]

A. So who did people blame for the problem?                          [bláme for oi]

B. The parents blamed me, of course. But the kids were hoping        [hóping for oi]
   for an adventure like this. They talked about the experience |    [tálked about oi]
   for the rest of the trip.

A. Are you sure you're cut out for *bus driving?                     [cut óut for oi]

B. At least I can put up with it.                                    [put úp with]



These dialogues illustrate the intimate interaction of discourse stress and construction 
stress. Clearly the construction stress pattern of phrasal verbs does not carry the day. If it 
did, all of the verbal synonyms and all of the paraphrases after the verb heads would sport 
discourse stress. The discourse-level determination of new and old information, discovered 
only from context, is the essential first step to the placement of discourse stress. Yet the 
discourse stress rule is helpless to position stress correctly without construction stress 
information. Both levels of stress are interdependent and equally responsible for correct stress 
placement. 



THE FORMAL RULES 

Stated formally, the following rules capture the essence and interaction of the two 
levels of stress assignment. The examples use two- and three-word verbs, stressless and 
stressable particles, content words and constructions, and new and old information in various 
positions in an effort to show the diversity of circumstances these rules control. 



Discourse stress: 

Discourse stress falls on the last 
content word or construction in the 
string of new information. 

Construction stress of phrasal verbs: 

1. When a new-information content 
word or construction follows the verb 
head, apply discourse stress. 

2. When no new-information content 
word or construction follows the verb 
head, place stress on a stressable 
particle; otherwise place stress on the 
verb head. 



Examples 

[This dialogue between a husband and wife 
repeats old information in synonyms for disapproving 
(object to) and eating at (get me down) 
and in verbatim repetition (viewpoint). Lines 
are numbered for reference.] 



A. I can't put up with your silence.        [1]
   You picked up the newspaper | and        [2]
   looked disapprovingly at it.             [3]
   What's éating at you?                    [4]

B. Oh, Al's editórials get me down.         [5]

A. What about them? 

B. It's his viewpoint I object to.          [6]

A. But he's entitled to a viewpoint.        [7]

B. I know. But... 



The two rules encompass so many possible situations in discourse that it may be helpful 
as a summary to highlight what the rules do in the example dialogue. First, the discourse 
stress rule does not position stress in a phrase without taking into account the stress 
conventions of the content words and constructions involved. Second, when a phrasal verb is 
part of the new information, the stress-assignment process engages the rule governing phrasal- 
verb stress. Here is what the rule does. 

Case 1 of the phrasal-verb stress rule applies to a new-information content word [1] 
or construction [2] anywhere after the verb head. If it is immediately after the verb head [3], 
the following particle is treated as a function word. But it may follow the entire verb [1,2]. 
In both situations, stress goes on the last new-information content word or construction. Case 
2 accounts for instances in which neither a content word nor a construction occurs after the 
verb head [4] and for instances in which a content word or a construction occurs after the verb 
head but is not new information [7]. In both situations, the rule user must be able to 
distinguish stressless from stressable particles. By noting the stress of stressable particles, case 
2 simultaneously accounts for two- and three-word verbs with stressable particles; the 
otherwise case accommodates verbs with stressless particles. 

Finally, when the phrasal verb is not part of the new information at all [5,6], the verb 
construction rule does not figure in stress assignment. The discourse stress rule places stress 
on the last new-information content word [5] or construction [6] according to their stress 
patterns. 
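
The two rules can be sketched in code. The following Python fragment is only an illustration under stated assumptions: the `Word` class, the `kind` labels, and the function name `stressed_word` are hypothetical, the new/old status of each word is labelled by hand (the rules themselves do not decide it), and `STRESSLESS` holds only the stressless particles that occur in the examples of this paper, not a complete inventory.

```python
from dataclasses import dataclass

# Assumption: only the stressless particles seen in this paper's examples.
STRESSLESS = frozenset({"about", "at", "for", "of", "to", "with"})


@dataclass
class Word:
    text: str
    kind: str          # "content", "function", "verb_head", or "particle"
    new: bool = False  # new information? (hand-labelled from context)


def stressed_word(phrase):
    """Return the text of the word that receives discourse stress, or None.

    Assumes the phrase contains at most one phrasal verb.
    """
    heads = [i for i, w in enumerate(phrase) if w.kind == "verb_head" and w.new]
    if heads:
        head = heads[-1]
        after = phrase[head + 1:]
        new_content = [w for w in after if w.kind == "content" and w.new]
        if new_content:
            # Case 1: a new content word follows the verb head.
            return new_content[-1].text
        # Case 2: stress the first stressable particle, otherwise the verb head.
        for w in after:
            if w.kind == "particle" and w.text not in STRESSLESS:
                return w.text
        return phrase[head].text
    # The phrasal verb is old information (or absent): plain discourse stress.
    new_content = [w for w in phrase if w.kind == "content" and w.new]
    return new_content[-1].text if new_content else None


# Line [4] of the example dialogue: "What's eating at you?"
line4 = [Word("what's", "function"), Word("eating", "verb_head", True),
         Word("at", "particle"), Word("you", "function")]
print(stressed_word(line4))  # eating
```

Because the new/old labels are inputs, the sketch mirrors the point made above: context must be consulted first, and only then can construction stress place the accent.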



CONCLUSION 

The stress of sentences with phrasal verbs is highly regular and basically simple. Since 
the rules were designed for anyone who can distinguish content words from function words, 
learners need no longer memorize or guess at the stress of the phrasal verbs they encounter; 
they can predict that stress and do so with striking accuracy. An indication of this is the 
performance of those sixty-five University of Illinois students mentioned above. By the end 
of their pronunciation course, the same students whose pretest success rate at the beginning 
averaged only 37% had an average posttest success rate of 84%. 

Improvement such as this should encourage writers of grammar and pronunciation 
materials to break with their long tradition and fill the void in their texts with guidance about 
the stress system of phrasal verbs. Only then will their students' oral performance convey the 
discourse signals that listeners expect and need in order to process speech successfully. 

NOTES 

1. This is not the only collection of phrasal verbs, though one of the most ample. See 
also Heaton (1965) and Svenconis (1982), each with more than 3000 entries. 

2. One of the first principles of pedagogical rule writing is to observe the No-Prior-Knowledge 
Assumption (Dickerson 1985), namely, the assumption that the learner does not 
already know the language and does not have native-speaker resources. Any violation of this 
assumption imposes an unfair burden on learners and guarantees their flawed performance. 
Unless learners can be trained to make accurate judgements as to whether the meaning of a 
verb is literal or idiomatic, or whether a particle is an adverb or a preposition, or whether a 
phrasal verb can take an object after the particle or not, these prerequisites for rule use qualify 
as violations. 

3. This group of eight does not correspond precisely with what are often categorized as 
prepositions. Some British authors suggest that among the prepositions are also after, into, 
off, over, through, without. But in all sentences I have examined where these words are in 
final, new-information phrasal verbs, stress is on the particle, not the verb head, e.g. It's 
silly to run áfter them. Is this what I should go óver? Why do withóut it? 

4. Illustrated in the next-to-the-last line of the dialogue, in the construction bus driving, is 
a paraphrase not of a particular word but of a shared understanding, namely, that the parties 
are talking about speaker B's recent bus driving. 

5. This paper has focused on only one important function of discourse stress, namely, 
to highlight new information. Another of its functions is to highlight explicit contrasts for the 
listener. In this role, the phrasal verb rules play no part. In fact, contrasts may even violate 
construction stress rules. For example, I was talking about him, not talking tó him. Or, He 
said he'd rather wálk out of prison than break out. 

THE STRESS OF COMPOUND NOUNS: 

LINGUISTIC CONSIDERATIONS AND PEDAGOGICAL IMPLICATIONS 



Laura D. Hahn 



The stress of compound nouns contributes a unique prosodic 
pattern to the rhythm of English. This pattern functions 
systematically in terms such as flower pot and graveyard. However, 
the boundaries delineating compound nouns are far from clear. 
Furthermore, the stress pattern is not applied with obvious 
regularity; for example, we find a lúnch box but a box lúnch. This 
paper will investigate some of the difficulties involved in defining 
and stressing compound nouns, and discuss implications for ESL 
pronunciation teaching. 



BACKGROUND 

It is now widely recognized that prosodic features contribute significantly 
to the meaning of an English utterance (Avery & Ehrlich, 1992; Morley, 1991; 
Pennington & Richards, 1986; Brazil, Coulthard & Johns, 1980; Stevens, 1989, 
among others). Because proper stress, rhythm, and melody directly affect 
intelligibility, they are often the target of instruction and practice in ESL 
pronunciation courses. 



However, many aspects of the English prosodic system are so complex, so 
inadequately defined, and/or so ineffectively adapted for classroom materials 
that instructors are often hard pressed to present satisfactory lessons on them. 
This is undoubtedly the case with compound constructions. As Taylor (1991, p. 
67) states, "It is notoriously difficult to know how to stress English compound 
words." 

Roughly speaking, compound constructions are combinations of two or 
more words which work together to form a single unit of meaning. They can 
function as: 

nouns: culture shock, driving test, social life 
adjectives: good-looking, heartfelt 
verbs: to footnote, to ice skate 
adverbs: upstairs, inside out 

An investigation of compound nouns alone reveals the broad scope of issues 
related to defining and teaching compound constructions. 
Compound nouns can be described on three levels: syntactic, semantic, 
and phonological; on each level, they are often compared to noun phrases. On 
the syntactic level, the elements of a noun phrase constitute separate syntactic 
elements, while the elements of compound nouns make up one single syntactic 
element: 



Noun Phrase (adj. + noun)      Compound Noun (one noun)
soft powder                    curry powder
a different dress              a party dress
social interaction             social life
cold beer                      cold cream



Compound nouns can also be analyzed in terms of the parts of speech of the 
component elements. The most common combination by far is Noun + Noun, 
though many others exist. Following are some examples: 

Noun + Noun: area code, folksong 
Noun's Noun: Adam's apple, bull's eye 
Adjective + Noun: middle man, cold front 

Noun + Verb or Verb form: jaywalking, stargazing, sunrise, nightfall¹ 

Verb + Noun: grindstone, pickpocket 
Verb + Preposition: a rip-off, a hangover 

Stageberg provides two criteria which distinguish compound nouns from 
similar grammatical constructions. First, compound nouns "cannot be divided 
by the insertion of intervening material" (1971, p. 109). For example, soft white 
powder is a possible grammatical utterance; therefore soft powder is not a 
compound noun. But curry red powder is not grammatical, so curry powder is a 
compound noun. The second criterion is that one element of a compound noun 
"cannot participate in a grammatical construction," (p. 110). For example, in, "I 
need a completely different dress," different can participate with completely. 
However, in, "I need a completely party dress," party cannot participate with 
completely. Stageberg notes that this test is useful in potentially ambiguous 
statements. For example, "I bought some very cold cream" can only occur when 
cream refers to the dairy product. In this case, very and cold can work together. 
However, the same sentence is not grammatical if cold cream refers to the make-up 
remover; very cannot participate with cold in this case, indicating that cold 
belongs instead with cream as a compound noun. A similar test is that the first 
element of compound nouns cannot be inflected: colder cream is not possible 
when cold cream is the make-up remover. 

On the semantic level, compound nouns can be examined in terms of the 
functional relationship of the elements, in which each "contribute[s] to the 
meaning of the whole lexeme" (Poldauf, 1984, p. 121). Quirk et al. (1975, p. 444) 
give the following examples: 

"playboy" means the boy plays, i.e. verb + subject 
"call-girl" means X calls the girl, i.e. verb + object 
To compare compound nouns with noun phrases on this level, many 
authors [e.g. Quirk et al. (ibid.), Kreidler (ibid.)] give examples such as Frénch 
teacher vs. French téacher. Here, the first item, a compound noun, describes a 
teacher who teaches the French language; the second item, a noun phrase, 
describes a teacher who is of French origin. 

Compound nouns can also be compared with noun phrases on the 
phonological level. Most English phonology manuals and ESL textbooks make 
the generalization that compound nouns are characterized by stress on the first 
(or penultimate) element of the compound (compound stress). In contrast, noun 
phrases carry more stress on the final element (noun phrase stress). Along the 
lines of the French teacher examples above, stress is assigned to these 
combinations as follows: 

Noun Phrase          Compound Noun
dancing téacher      dáncing teacher
blue bírd            blúe bird

Most compound nouns fit simply into such syntactic, semantic, and 
phonological parameters. When students master the ability to produce the 
compound stress pattern and apply it to the most common syntactic pattern, 
Noun + Noun, they will accurately stress the vast majority of English compound 
nouns. Therefore, this is a teachable pattern which can help students improve 
the rhythm of their spoken English. 

However, the system is actually much more complex. Pronunciation 
teachers must take into account the following factors which complicate the 
teaching of compound nouns: 

a. the unreliable relationships between the syntactic/semantic patterns 
and the stress patterns, 

b. the generative nature of compound nouns, 

c. the variability in stress of some compound nouns, 

d. the effect of discoursal context on compound noun stress. 

Unreliable Relationships 

It is not always immediately clear whether word combinations constitute 
noun phrases or compound nouns, because neither the syntactic, nor the 
semantic, nor the phonological description is entirely adequate for differentiating 
them. Learners need to know when to apply the compound stress pattern, but 
they cannot use surface-level syntactic clues reliably. This is because there is no 
clear one-to-one correspondence between the stress pattern and 
syntactic/semantic patterns. For example, there are compound nouns which on 
the surface look syntactically like noun phrases, e.g. temperate zone, dining hall, 
social skills, but take compound noun stress. In these cases, learners would 
erroneously apply noun phrase stress where compound stress is required. Tests 
such as those proposed by Stageberg are helpful to ESL instructors for such items, 
but not for students, who cannot be expected to use meaning as a guide to 
determine the status of a newly encountered term. 

Furthermore, while most compound nouns consist of Noun + Noun, 
there are many which do not take compound stress, e.g. star player, cherry pie, 
computer science. In such cases, learners may overgeneralize the compound 
stress pattern and apply it erroneously to such constructions. Here, Stageberg's 
syntactic criteria help to the extent that they identify such constructions as 
compound nouns, but they do not help determine the stress. Part of the 
difficulty, then, is distinguishing which stress principles apply to which 
construction types. 

Fudge (1984, p. 136) addresses such problems. He says, "There is surely no 
syntactic reason for saying that Christmas cake is a compound whereas Christmas 
pudding and Christmas pie are straightforward noun phrases, and yet the stress- 
patterns are totally distinct." He goes on to conclude that "we have little 
alternative but to recognise a second type of compound whose nuclear stress falls 
on the compound final, and whose stress pattern is therefore identical with the 
phrasal pattern." 

Kreidler (1989, p. 222) takes a slightly different approach. He states that 
"there isn't always a clear criterion for deciding what is a compound and what is 
a phrase, except for knowing what the stress pattern is. We cannot say that 
compounds have one kind of meaning and phrases have another." 
Unfortunately the stress pattern is precisely what students do not know. 

Furthermore, Taylor (1991, pp. 69 - 70) cites examples of Noun + Noun 
terms which are sometimes stressed as compound nouns, and sometimes 
stressed as noun phrases. 

Poverty seems to be related to family size. 

When buying washing powder I always buy the family size. 

He explains that, "In the first case, we are talking about the size of a family, and 
in the second case, about the size for a family." He goes on to suggest that "the 
grammatical relationship between the elements" affects the status of the word 
combinations. 

In an attempt to sort out the syntactic and semantic relationships 
involved, Poldauf (ibid.) suggests something of a continuum of compound 
noun-type constructions. He suggests that "composite lexemes" are often formed 
by syntactic procedures, and therefore are stressed on the final element. He 
provides over 30 semantically defined categories of composite lexemes, such as 
"attributes referring to a material" (p. 112), e.g. milk chocolate, and 
"nomenclature referring to plants" (p. 111), e.g. globe artichoke. He classifies 
compound nouns not as syntactically derived, but as derived "by special 
procedures" (p. 106). He includes over 30 categories of compound nouns, which 
take compound stress. However, the sheer number of categories, subcategories, 
and exceptions prevents the pronunciation instructor from gleaning much 
practical pedagogical help from this type of analysis. 

Hence the qualifications for compound noun status are not entirely clear. 
Fudge suggests that semantic and syntactic criteria are sufficient, even though 
compound noun stress is not used. On the other hand, Kreidler seems to 
assume that the stress pattern is the crucial criterion.² While Poldauf's 
correlations of stress patterns with specific syntactic and semantic categories 
represent a helpful approach, they are lacking in practicality. Part of the 
challenge, then, is identifying criteria which are sufficiently reliable to provide 
students with more accurate, comprehensive access to the stress system. 

Generativity 

Another factor to consider is the generative nature of compound nouns. 
There is not a contained set of compound nouns to be learned; rather, native 
speakers regularly create novel compound combinations. One such example is 
mood flurries, a novel term used by a television broadcaster to refer to a light 
snowfall predicted for Christmas Eve (Allison Payne, WGN News, December 23, 
1992). There is little doubt that all native speakers would use compound stress 
on this term; therefore native speakers do have some internalized awareness of 
compound noun stress. The terms voice mail, laser printer, and health food are 
examples of relatively recent compound noun creations (Safire, 1992). 
Compound noun coinages are also common in everyday speech, although most 
are less vivid. One can easily imagine: "He's going to use a Chomsky quote for 
the essay item on the linguistics test," although these terms as compound nouns 
are not part of the standard lexicon. Thus it is important to be aware of the 
prevalence and ongoing creation of compound nouns in the language. 



The extent to which a given compound noun is stressed with compound 
stress could be related to the degree to which the combination is "felt" as a 
cohesive unit by the speaker. Taylor (ibid., p. 72) suggests that "the more fusion 
there is in the eyes of the speaker, the more likely it is that a compound will 
behave phonologically like a single word and have single [i.e. compound noun] 
stress." The speaker's perspective, then, could influence the stress of newly 
constructed terms.³ 

Variability 

Another complication in the analysis and teaching of the pronunciation of 
compound nouns involves the variability with which stress is placed on some of 
them. The variability may come from different sources. First, there is regional 
and dialectal variability (Kreidler, ibid.) — for example, ice cream, green beans, 
cream cheese. Furthermore, Poldauf (ibid.) suggests that American English tends 
to use more compound noun stress than British English does. 
Second, some word combinations which are relatively new may be in the 
process of becoming compounds. It is believed that many compound nouns 
evolve from separate syntactic elements with noun phrase stress into a cohesive 
unit with compound noun stress. Taylor (ibid.) cites revealing examples given 
by Kingdon in 1958. At that time, the following combinations were stressed as 
noun phrases; they now regularly take compound noun stress: box office, 
vacuum cleaner, traveller's cheque, stage manager. Thus the evolutionary 
nature of compound nouns could explain the current stress variability found in 
case study, a relatively new term. 

Third, certain aspects of a speaker’s intent contribute other types of 
variability. Bolinger (1986) suggests that stress can be placed on the last word of a 
phrase in order to create a climactic impact. When a compound noun falls in 
this position, the stress may shift to the last element. News announcers and 
other public speakers sometimes do this for emphasis. In the following example, 
a sentence which ended that news item, sports announcer Dan Roan (WGN 
News, December 23, 1992) applies noun phrase stress to the final term, which 
usually takes compound stress: 

Five-year vets are unrestricted free agents, though teams can protect one 
franchise pláyer. 

But Bolinger acknowledges that this phenomenon could also be an effect of 
reading aloud, wherein some speakers do not attend to meaning, and therefore 
fail to regard the compound noun as a cohesive unit. 

Context 

Another complication involves the issue of the larger context of the 
utterance. For example, love story and love letter are in and of themselves 
compound nouns with compound noun stress. The stress changes when they 
occur together: 

He wrote her a lóve story. He also wrote her a love létter. 

The principle here is that repeated information (here, the second love) is not 
stressed, but contrasting information is. This contrast causes the stress to shift to 
letter and the compound stress to be overridden. Sometimes, this principle 
causes changes in the stress of more than one compound noun; compare: 

I wanted meatballs, not Alfredo sauce. 

I wanted meatbálls, not meat sáuce. 

Therefore, contrasts and other contextual factors can affect the stress of 
compound nouns. 
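
The repetition-and-contrast principle can be sketched procedurally. The Python fragment below is a rough illustration, not an analysis proposed in the literature cited here: the function names are hypothetical, each compound is split by hand into its elements (so meatballs becomes "meat" + "balls"), and compound stress on the first element is assumed as the default.

```python
def stress_in_sequence(compounds):
    """Stressed element of each compound when they are spoken one after another.

    An element that has already been heard is skipped in favour of the first
    element not yet heard; otherwise the first element keeps compound stress.
    """
    heard = set()
    stressed = []
    for compound in compounds:
        fresh = [word for word in compound if word not in heard]
        stressed.append(fresh[0] if fresh else compound[0])
        heard.update(compound)
    return stressed


def stress_in_contrast(first, second):
    """Stressed elements of two compounds set in explicit contrast.

    Elements shared by both compounds are passed over in both of them.
    """
    shared = set(first) & set(second)

    def pick(compound):
        for word in compound:
            if word not in shared:
                return word
        return compound[0]  # nothing distinguishes them: keep the default

    return pick(first), pick(second)


print(stress_in_sequence([("love", "story"), ("love", "letter")]))
# ['love', 'letter']
print(stress_in_contrast(("meat", "balls"), ("meat", "sauce")))
# ('balls', 'sauce')
```

The two functions differ only in direction: in a sequence, only the later compound can react to the earlier one, whereas in an explicit contrast the speaker already knows both terms, so both can shift.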

In sum, most compound nouns are readily recognizable, and therefore the 
stress is readily teachable: Noun + Noun. However, this pattern alone is far too 
generic. Other factors (variability, context, and various subcategories of syntactic 
and stress-related patterns, with their exceptions) all contribute additional 
complexities. 



Most ESL pronunciation textbooks have dealt with compound nouns, but 
in varying degrees of detail. Some authors give only brief mention to them. 
Clear Speech (Gilbert, 1984) mentions compound nouns only in the teacher’s 
guide (p. 20), yet uses compound nouns as practice items in the student text. For 
example, Gilbert’s lesson on the intonation of thought groups contrasts the 
following two utterances without mentioning the role of compound noun stress: 

1a. He sold his houseboat and trailer. 
1b. He sold his house, boat, and trailer (p. 6). 

The Manual of American English Pronunciation (Prator and Robinett, 
1985) provides just one paragraph on compound nouns, with four examples. 
Improving Spoken English (Morley, 1979) says little more. Morley briefly defines 
compound nouns as one of several sets of words which are stressed on "the first 
syllable" (p. 14). She does not address compound nouns in which the stressed 
syllable is not first, e.g. communication gap, defense mechanism; nor does she 
address syntactic variables. 

Most authors, however, define compound nouns more fully. Sounds and 
Rhythm, A Pronunciation Course (Sheeler & Markley, 1991) defines compound 
nouns as two words "used as a single noun," with "strong stress on the first 
word" (p. 25). They provide several kinds of perception and production 
exercises, as well as periodic recycling of compound nouns throughout the rest of 
the text. Chan’s treatment (in Phrase by Phrase, 1987) is along the same lines. 

Accurate English (Dauer, 1993) also addresses compound nouns similarly. 
The author’s emphasis is on the stress differences between compound nouns 
and noun phrases. Therefore her exercises focus on such contrasts as baseball 
player vs. famous player (p. 111) and cheapskates vs. cheap skates (p. 112). 

Say it Clearly (English, 1988), Communicate (Smith et al., 1991), and Sound 
Advantage (Hagen & Grogan, 1992), define compound nouns more narrowly: as 
consisting of two or more nouns with stress on the first noun. Interestingly, they 
use non-nouns in their examples: blackboard, greenhouse, rewind button, 
freeway. These three texts have comparatively sparse practice exercises. 

A unique feature of Communicate, however, is the treatment of 
compound nouns with three elements, e.g. Consumer Price Index, weight loss 
program. They suggest that there is often a two-element compound noun 
embedded in the three-part one, and that the stress stays on the first noun of the 
two-element compound. 

Relatively few textbooks address the complexities of compound nouns
much further. While Handschuh and de Geigel (Improving Oral 
Communication, 1985) define compound nouns much as other authors do, they 
also treat compound proper nouns. They explain that these are usually stressed 
on the second (or last) element, e.g. Easter Sunday, Lake Michigan. They give a
sizeable list of examples, and discuss some of the exceptions. 

Dickerson’s Stress in the Speech Stream (1989) provides perhaps the most 
comprehensive treatment of compound nouns in a pronunciation textbook. 
While his definition is much the same as that of other authors, he acknowledges 
that "we cannot apply (compound noun stress) with our customary precision" (p.
15), and that, in some cases, "...if you do not know the meaning of the
construction, you will not be able to predict the stress with confidence" (p. 14).

Dickerson treats the following Noun + Noun constructions which do not 
take compound noun stress, but instead are stressed on the last element: 

1. Names of people and places: Bill Rogers, New York City 

2. Names of publications: London Times, National Geographic 

3. Titles: Assistant Professor, Surgeon General 

4. Initials: UFO, Ph.D.

5. Proper names in the first noun: Colt forty-five, Concord grapes

He also treats the following combinations which do not consist of the 
more obvious, reliable Noun + Noun, but which take compound stress: 

1. Adverbs and neutral prefixes in compound nouns: hideaway, downfall 

2. -ing + Noun: printing press, magnifying glass³

PEDAGOGICAL IMPLICATIONS 

Again, despite the complexities, there are enough compound nouns to 
warrant the time and effort spent on their unique prosodic patterns, and 
obviously the most common and straightforward — and prolific — category is 
Noun + Noun. However, how much more of the system can a pronunciation
instructor reasonably expect students to master? Following are some 
recommendations: 

1. Students need to know when they can use compound noun stress with 
confidence. In addition to some of the systematic exceptions discussed below, 
another extremely reliable clue to compound noun stress is the way the word is 
written. That is, almost all word combinations which are written as one word 
(e.g. background, toothbrush) and words which are hyphenated (e.g. looking-glass,
story-telling), take compound noun stress (Dickerson, 1989). These
combinations would include those with letters as nouns, e.g. e-mail, v-neck,
t-test.

2. Let students know that the system is complex. If students are made aware that
there are some regularities behind some of the apparent contradictions, they can 
begin to gain more of a sense of control over the system. These regularities 
include the effect of contextual factors, and some of the various systematic 
exceptions. This strategy may make other inconsistencies, such as variability and 
other less "teachable" exceptions, less overwhelming.

3. Address some of the major subsystems explicitly. Students can learn to 
recognize sets of words which take compound noun stress but look like noun 
phrases, and sets of words which take noun phrase stress but look like 
compounds. This will enable them to stress a much greater number of such 
constructions with accuracy. Dickerson's categories, outlined above, provide a 
strong base of the most clear-cut, easily manageable, and generative subsystems.

4. Address minor subsystems based on students' needs. It may not be necessary 
for dance majors to learn the stress of chemical compounds. Botanists, on the 
other hand, may wish to learn the stress patterns involved in plant 
nomenclature. Instructors must determine what types of constructions, and the 
level of detail, their students need. 

In addition to Dickerson's categories, the following subsystems may prove 
to be useful. Even if these are not explicitly taught, instructors still may find 
such subcategories helpful as potential explanations of terms they may 
encounter. 

Noun + Noun: Noun phrase stress 

a. N1 is a material out of which N2 is made (Fudge, ibid.). 

Examples: silver dollar, peach pie, glass slipper 

Exceptions: banana bread, ginger snap, N1 + juice, e.g. orange juice, apple
juice

b. N1 is the time N2 takes place, or is due to appear: 

Examples: afternoon nap, 8:00 class, morning dew, holiday traffic 
Exceptions: night classes, day shift 

c. N1 refers to a person or group of persons. 

Examples: child actor, group therapy, male stripper, team spirit 
Exceptions: faculty meeting, peer pressure 

d. N2 consists of three or more syllables. While this subsystem is less consistent, 
it may explain some occurrences. Poldauf (ibid.) suggests that different 
combinations in the number of syllables and in word stress of each individual 
element sometimes play a role in effecting different stress patterns. 

Examples: minority government, situation comedy, transistor radio 

Exceptions: conservation policy, correlation coefficient 

e. N1 is computer. These relatively new terms may be in transition.

Examples: computer programmer, computer science, computer graphics 

Exceptions: computer chip, computer disk 

In terms of compound nouns consisting of Adjective + Noun, one clue may be 
the meaning of the noun. Bolinger (ibid.) suggests that if the noun is empty or 
generic in meaning, relative to the adjective, stress may stay on the adjective. 
This clue, while useful to instructors, may not be viable for learners who are not 
advanced enough to rely heavily on interpreting the meaning of a term. This 
phenomenon seems particularly common in adjectives ending in -al. 

-al examples: social studies, clerical help, vital signs, personal life, dental
appointment

Other examples: nervous system, the good guys, little folk 

5. Continue to collect compound nouns and look for teachable regularities. This is
an ideal opportunity for teachers to become learners. When we are 
conscientiously involved in analyzing our own language, and creating useful 
materials based on our findings, we -- and our students -- benefit. Furthermore, 
when we share this examination process with our students, they in turn are 
encouraged to become their own teachers; to look for and analyze word or stress 
patterns they encounter. 

It may be unrealistic to expect that further research into the syntactic, 
semantic, and stress patterns involved in compound nouns will result in a 
simplification of the system for pedagogical purposes. Yet because compound 
nouns are such a vital, generative part of the English language, their unique 
prosodic patterns cannot be ignored in the pronunciation classroom. The 
principles and patterns proposed, however, can serve as guidelines for 
instructors to consider when teaching compound nouns. 
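
The orthographic clue in recommendation 1 can be sketched in a few lines of Python. This is only an illustration: the function name `stress_hint` and the labels it returns are hypothetical, and the rule covers just the "written as one word or hyphenated" clue, not the subsystems or exceptions listed above.

```python
def stress_hint(term: str) -> str:
    """Guess a stress pattern from the written form alone (hypothetical helper).

    Combinations written as one word or hyphenated almost always take
    compound noun stress; for terms written as separate words, the
    spelling by itself gives no reliable clue.
    """
    words = term.strip().split()
    if len(words) == 1:
        # Covers both solid (toothbrush) and hyphenated (looking-glass) forms.
        return "compound noun stress"
    return "undetermined from spelling"


for example in ["background", "looking-glass", "e-mail", "silver dollar"]:
    print(f"{example}: {stress_hint(example)}")
```

A term such as silver dollar falls through to "undetermined from spelling", which is where the semantic subsystems above (material, time, person) would have to take over.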

NOTES

¹It is actually difficult to determine the part of speech of some of the
elements. For example, fall can be either a noun or a verb. Poldauf (1984) seems
to suggest that the part of speech is determined by the underlying functional 
relationship of the word to the whole compound. For example, in nightfall, 
night is falling; hence fall is a verb. Similarly, pick is a verb in toothpick because 
it refers to an action done to the tooth (p. 122). 

²Kreidler himself seems to contradict this definition. At one point, he
does acknowledge that "some compound nouns have stress on each word of the 
compound," (p. 220). 

³Poldauf (ibid.) points out that combinations containing the -ing form in
the first element only take compound noun stress if the semantic relation
implies "intention or purposive causation" (p. 118). Therefore, a bathing suit is
intended for bathing, and the purpose of a teaching assistant is to teach. He goes 
on to list other types of -ing collocations which do not fit this description, such as 
living memory, founding fathers, fighting chance. 



REFERENCES 

Avery, P., & Ehrlich, S. (1992). Teaching American English pronunciation. 
Oxford: Oxford University Press. 

Bolinger, D. (1986). Intonation and its parts. Stanford: Stanford University
Press.

Brazil, D., Coulthard, M., & Johns, C. (1980). Discourse intonation and language 
teaching. London: Longman. 

Chan, M. (1987). Phrase by phrase: Pronunciation and listening in American
English. New Jersey: Regents-Prentice Hall.

Dauer, R. (1993). Accurate English: A complete course in pronunciation. New 
Jersey: Regents-Prentice Hall. 

Dickerson, W. (1989). Stress in the speech stream: The rhythm of spoken
English. Student Text. Urbana, IL: University of Illinois Press.

English, S. (1988). Say it clearly: Exercises and activities for pronunciation and 
oral communication. New York: Collier MacMillan. 

Fudge, E. (1984). English word-stress. London: George Allen and Unwin.

Gilbert, J. (1984). Clear speech: Pronunciation and listening comprehension in
American English. Cambridge: Cambridge University Press.

Hagen, S. & Grogan, P. (1992). Sound advantage: A pronunciation textbook.
New Jersey: Regents-Prentice Hall.

Handschuh, J. & de Geigel, A. (1985). Improving oral communication: A
pronunciation oral-communication manual. New Jersey: Regents-Prentice Hall.

Kreidler, C. (1989). The pronunciation of English: A course book in phonology. 
Oxford: Basil Blackwell Ltd.

Morley, J. (1979). Improving spoken English: Intensive personalized program
in perception, pronunciation, practice in context. Ann Arbor: University
of Michigan Press.

Morley, J. (1991). The pronunciation component in teaching English to
speakers of other languages. TESOL Quarterly 25(3), 481 - 520.

Pennington, M. & Richards, J. (1986). Pronunciation revisited. TESOL 
Quarterly 20(2), 207 - 225.

Poldauf, I. (1984). English word stress: A theory of word-stress patterns in
English. Oxford: Pergamon Press.

Prator, C. & Robinett, B. (1985). Manual of American English pronunciation. 
New York: Holt, Rinehart and Winston. 

Quirk, R. & Greenbaum, S. (1975). A concise grammar of contemporary English. 
New York: Harcourt Brace Jovanovich, Inc. 

Safire, W. (1992). Retronym watch. New York Times Magazine. November 1,
1992.

Sheeler, W. & Markley, W. (1991). Sounds and rhythm: A pronunciation
course. New Jersey: Regents-Prentice Hall.

Stageberg, N. (1971). An introductory English grammar. New York: Holt, 
Rinehart, & Winston. 

Stevens, S. (1989). A "dramatic" approach to improving the intelligibility of
ITAs. English for Specific Purposes: An International Journal 9(3), 181 - 194.

Taylor, D. (1991). Compound word stress. ELT Journal 45(1), 67 - 73. 



IDEAL 7, 1994



WHERE PHONOLOGY MEETS ORTHOGRAPHY 



Thomas R. Hofmann 



When phonology is expanded to include more than one dialect and we 
make a ‘cross-dialectal’ phonology of the several standards of English, it 
becomes possible to include the (systematic parts of) orthography as another
dialect. This turns out to be what is sorely needed by foreign learners of
English, as well as by natives learning to read. The role of rules relating
orthography to pronunciation, as well as that of exceptions, is also considered. Some
implications are sketched for theoretical linguistics as well. 



The main point of this paper is to argue for a principle: that an adequate
cross-dialectal phonology (a phonology that works for at least the major dialects or accents
of a language) naturally includes the regular parts of its orthography. If this is true, it has
serious implications for the teaching of language because language students must also learn 
how words are written. Learning the writing of words together with their pronunciations 
can be more efficient and yield better results. It also has strong implications for theoretical 
linguistics as the native speaker has the writings of words unified in some way with his 
phonology. As well, the regular parts of an orthography influence the phonological structure 
of a language through its historical drift. First, however, we must look at what a 
cross-dialectal phonology is. We will follow here the American usage of ‘dialect’ for a way 
of speaking but are concerned only with the pronunciation (what is sometimes called ‘an 
accent’). 

The idea of a phonology that is valid for the various dialects of a language has been 
an attractive ideal for some generations of linguists, but how it could work has not been 
discovered until recently. Daniel Jones (1950) proposed the ‘diaphone’ as the phonemic unit 
of such a cross-dialectal phonology, but began from the (then universal) idea of a standard 
dialect and looked at the reflections of its phonemic units in other less prestigious dialects.
This was not adequate, for it saw all other dialects as more or less accurate renditions of the 
standard. 



Henry Lee Smith’s ‘morphophonemes’ (1967) were units of pronunciation that did not
vary in different linguistic surrounds, stress and suffixes for the most part (the now standard
concept), and at the same time did not vary between dialects. With the great flexibility of
generative phonology in the 1970s, which reduced morphophonemes to little more than
Chomsky’s ‘systematic phonemes’, Bailey (1972) tried to describe all dialects with a
‘pan-dialectal’ phonology. That, however, was possible only with great difficulty, and
Luelsdorff (1975) showed it to be irrelevant to the goals of linguistics. Few if any speakers
are familiar with all the dialects of a wide-spread language such as English, nor do the 
various dialects influence the development of the language in anything like equal proportions. 

Yet the goal has persisted. Linguistic theory can describe only the phonology
possessed by a single speaker (usually an ideal person who speaks a standard). It thus has 
limited relevance to real people who are invariably aware of aspects of other dialects, and 
often enough can speak in more than one way. Indeed, nearly all speakers can vary between 
a relaxed, allegro way of speaking and a more formal and carefully enunciated way. But 
it is not just the relevance of linguistic theory that is at hand. 

When we teach a language to a non-native speaker, it is seldom enough to teach a 
single ideal dialect. For countries with a single high-prestige standard dialect such as Japan 
or France, or even England, the language is taught that way, partly because the standard is 
considered the only option for an educated native or for a foreigner and partly because 
linguistics has nothing else on offer than a single dialect. In North America, there is a 
greater sense of equality among dialects, and foreigners are not necessarily taught the 
contemporary broadcast standard, General American (GA hereafter). Also, the US is
moving toward losing its post-vocalic /r/.* Teachers use some compromise between their 
native dialect and the broadcast standard. 

It is not enough that the student learn to speak in a standard way, so that natives can
understand him; he must also learn to understand natives. In strongly centralized languages,
most natives can make a fair rendition of the standard for the sake of a foreigner, but this 
is far from true of the US, or Scotland, or Australia, for US or UK standards. In effect, 
a student has not learned enough English to converse in general until he has some tolerance 
for varying ways of speaking, and that means he needs to learn at least which of the features 
of pronunciation are common and which are geographically variable, and to have some idea 
of what variation is possible. 

I have found it a useful exercise for students in Japan to give classroom directions 
occasionally in an Australian accent, a US Black accent, a South Bostonian accent, and even 
a Scottish accent, even though I am poor at rendering them, especially the latter. The range 
of pronunciations prepares them for what they may and will meet, and it helps them discover 
what is common to all these varieties. No matter that a teacher is not very good at imitating 
an accent; the imperfect imitations will still guide students to relax their ideas of correct 
pronunciation in the right direction, and to learn what aspects are the same in all dialects. 

This is something that even native children have to do. When one listens to only one 
dialect, one can use features that may be absent in another dialect. Only with experience 
does a person learn to use the features that are common across dialects. And those features, 
I would argue, tend to persist over time and are the really important features in 
pronunciation. 

On a practical level, dictionaries and teaching materials need to indicate pronunciation 
in a way that is valid for different dialects, at least for the two most important standards of 
English, the British and US broadcast standards. Learner’s dictionaries are being produced 
in England, but they give pronunciation primarily in the British standard, though 
progressively better attempts are being made to include GA pronunciations as well. This 
addition of GA may not help students, however. Rather it may confuse them with too much 
information. And, it is not enough for a student living in Georgia and being taught GA but 
hearing Southern varieties around him, on television and even from teachers. 

As well, except for texts devoted to pronunciation, teaching materials seldom include
any indication of pronunciation at all, their authors being unable to choose which
pronunciation to include. A publisher naturally does not want to limit the sales area of his
texts by giving only one dialect.

Faced with these needs, a new notion of cross-dialectal (XD) phonology emerged, 
and has been described in its theory (Hofmann, 1991, 1990). To demonstrate it in the worst 
possible case (other than Chinese), the vowel pronunciations of English are reduced to units 
(diaphones if you will) that are pronounced in slightly different ways in different dialects. 
In a nutshell, an XD phonology includes a few extra diaphones that are pronounced one way
in one dialect and another way in the next. Moreover, it does not try to cover all the
dialects of English, but only the major ones that all speakers have some knowledge of and
that all foreign learners should be aware of. The XD phonology given in the Appendix
has apparently managed this with only one or two extra vowel units for the English vowel 
sounds, for the GA and the BBC standards of pronunciation. Yet it appears valid for a wide 
range of dialects, from Jamaican and Australian to Scottish and Irish, with a small fudge for 
the latter two. To give an example, Northern English dialects ‘merge’ (i.e. pronounce alike, 
as a single phoneme) <u> and <65> (I will use < ... > to identify diaphones), as in but,
mud, duck and foot, cook, could. Another example is that most dialects merge <ur> and
<ir> as in fur, fir, while strong Scottish does not.
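
The idea of diaphones that are kept apart in some dialects and merged in others can be pictured as a small lookup table. The Python sketch below is only an illustration under stated assumptions: diaphones are named by keyword (but, foot, fur, fir), the labels "V1" to "V4" are arbitrary placeholders rather than phonetic values, and the General American rows and the Northern English fur/fir row are assumptions consistent with "most dialects merge" rather than data from the Appendix.

```python
from typing import List, Optional

# Each dialect maps a diaphone (named by keyword) to an arbitrary label.
# Two diaphones are merged in a dialect when they share a label.
REALIZATIONS = {
    "General American": {"but": "V1", "foot": "V2", "fur": "V3", "fir": "V3"},
    "Northern English": {"but": "V2", "foot": "V2", "fur": "V3", "fir": "V3"},
    # Only the fur/fir contrast is described for strong Scottish in the text.
    "Strong Scottish": {"fur": "V3", "fir": "V4"},
}


def is_merged(dialect: str, a: str, b: str) -> Optional[bool]:
    """True if the dialect pronounces diaphones a and b alike, None if unknown."""
    table = REALIZATIONS[dialect]
    if a not in table or b not in table:
        return None
    return table[a] == table[b]


def dialects_keeping_apart(a: str, b: str) -> List[str]:
    """List the dialects in the table that distinguish diaphones a and b."""
    return [d for d in REALIZATIONS if is_merged(d, a, b) is False]


print(is_merged("Northern English", "but", "foot"))
print(dialects_keeping_apart("fur", "fir"))
```

A pair of diaphones belongs in the cross-dialectal inventory as soon as at least one major dialect keeps the two apart; a learner who knows the full set can then treat each dialect as a set of mergers.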

When students learn pronunciation with these units, which correspond closely with 
the traditional English notions of long and short values for vowel letters, they are prepared 
for teachers from any of these dialects, or to accommodate to these varieties in daily life. If
exposed to them early in their education, they will learn to listen mostly to what is invariant 
across dialects. In this way they can be much less sensitive to and more tolerant of 
variation. They will not be like a Japanese colleague of mine, an English teacher who took 
pride in his excellent English, who found on his first visit to the US that he could not 
communicate at all with his taxi driver except in writing! 

Such are the needs and promises of an adequate cross-dialectal phonology. It has not
come out of linguistic theory because theoretical interests are pursuing other, ‘hotter’ topics. 
As well, linguistic theory continues to subscribe to a self-limiting assumption of 
Bloomfield’s, namely that there is a ‘community’ of speakers who all talk alike, so
linguistics proper has never bothered to explore the reality of dialectal and register variation
that is ever-present. Those few that do venture into this area (e.g. Bailey, Labov, Smith,
Jones and, I guess, myself) are seen to be dabbling in interesting but peripheral topics. 

Apparently applied linguistics will thus have to make its own theory, much as
engineering disciplines make the theory they need while pure physics pursues the structure
of quarks. However, the theory promises to become important in linguistics and
psycholinguistics because it is a more realistic model of the native speaker. A perhaps bigger
reason for it to become important is that it implies that the written form of a literate 
language is an important aspect of the phonology that speakers internalize, as we can now 
show. 

ORTHOGRAPHY



When people learn another dialect, as when a child learns a standard way of
pronouncing at school, they combine it with their earlier way of pronouncing to form an XD
(cross-dialectal) phonology. As such they learn to speak and to understand in both their
native and the standard ways of pronouncing. How this is done is not known yet, but it is
done, and may be repeated many times as one learns to speak in a number of fashions.
Even those born into a standard dialect who never learn to speak in any other style still learn
to speak in different registers, most notably a relaxed allegro fashion and a more formal,
clearly enunciated way. They also learn to understand people from other dialects, e.g.
popular singers.

Learning a phonetic orthography (a way of writing based on sound) is no different;
an XD phonology is formed that works for both pronunciation and writing just as one is
formed to incorporate a new dialect into one’s knowledge. A phonetic orthography is no 
more nor less than a written dialect, and learning it influences one’s perceptions of words 
to the degree that the adult no longer hears sounds but sequences of letters. Learning to read 
and to write is not something that is simply added to an individual’s knowledge of his 
language, but it profoundly influences the person’s knowledge of its structure, not only in 
the details of how some words are pronounced but even in the nature of pronunciation. 

The existence of a few ‘spelling pronunciations’ (considered as errors) is well known, 
but this process is far more extensive than we realize, even in English. There are large
numbers of words like albino, plaza, Maria and so forth that have two pronunciations. One 
has a short or long value given to the stressed vowel; the other has a ‘continental’ value (as 
in Spanish or many other languages). These latter have been generally gaining ground for 
the last century (Wells, 1982; Bolinger, personal communication). One of these
pronunciations is a spelling pronunciation, showing the importance of spelling in the 
phonological evolution (and thus the structure) of our language. This is why Coulmas (1989) 
could observe that writing systems of the world tend to match the phonological structure of 
the languages they are used with. 

Because a sound-based orthography is internalized, along with a range of
pronunciations, most native speakers of a literate language have a combined representation. This
is why they can write down a new word, or pronounce a new brand name, without much 
doubt even in a nominally unphonetic language such as English. In fact, it means that words 
are commonly stored mentally in something like written forms and pronounced on that basis. 
The faux amis of French and English are not words with different meanings that are 
pronounced alike; they are words that are written alike. This shows that similar writings 
(with divergent pronunciations) cause problems, not similar pronunciation. As well, 
probably every person who speaks both French and English has made mistakes of 
pronouncing French words in English when they don’t happen to exist in English, and 
vice-versa. Examples can be cited ad infinitum to show that at least between English and 
French (which share a large written vocabulary but very little in pronunciation), it is the 
written form that people pronounce in the other language. If this can happen between 
different languages, as different in pronunciation as English and French, it surely happens 
between different dialects of the same language. When we listen to another dialect, then, 
we must be making use of written forms and our knowledge of how the other dialect
pronounces them. 

This idea that writing can be treated as another dialect in an XD phonology will not 
be accepted too quickly in linguistics, for writing has long been excluded as a part of 
language. Yet it must be accepted as a possibility, and in view of the interplay between 
pronunciation and spelling, it must be accepted as plausible. It will be verified by observing 
how children’s pronunciation is modified as they learn to read and write, always in a 
direction that is more consistent with the orthography. 



ENGLISH ORTHOGRAPHY 

Some will deny that English spelling corresponds to pronunciation, but the work of 
Dickerson (1985, 1990, 1992) among others shows that this is not true. On the most basic 
level, a letter P for example corresponds to a phoneme /p/, and essentially^ all English 
speakers know this. Linguists and spelling reformers have often been delighted, or 
discouraged, to note the many examples where the match is not simple and perfect, but if 
we look at each letter of the alphabet in turn, only a few have more than one systematic way 
of being pronounced, given that we are intelligent enough to take some combinations like 
PH, CH, SH as single units. True, there are exceptions to nearly every simple rule, like the 
P in pneumonia, but cast these exceptions aside (and identify them as exceptions to the 
student) and there is a vast regularity in English spelling that any learner or teacher would 
be a fool to ignore. 
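
Treating combinations such as PH, CH and SH as single units is easy to make concrete. The Python sketch below is only an illustration: the function name `spelling_units` is hypothetical, and the digraph list contains just the three combinations named above, not a full inventory of English spelling units.

```python
from typing import List

# Only the three combinations mentioned in the text; a fuller list is left open.
DIGRAPHS = ("ph", "ch", "sh")


def spelling_units(word: str) -> List[str]:
    """Split a word into spelling units, keeping listed digraphs together."""
    lowered = word.lower()
    units = []
    i = 0
    while i < len(lowered):
        pair = lowered[i:i + 2]
        if pair in DIGRAPHS:
            # A digraph counts as one unit with its own regular pronunciation.
            units.append(pair)
            i += 2
        else:
            units.append(lowered[i])
            i += 1
    return units


print(spelling_units("shop"))
print(spelling_units("graph"))
```

Once words are segmented this way, each unit can be paired with its regular pronunciation, and a word like pneumonia, whose initial P is silent, is simply flagged as an exception.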

Unfortunately, thanks to linguists and to having learned many kanji by rote, Chinese 
and Japanese students often try to learn English words without utilizing the knowledge, e.g.,
that P stands for /p/! This is a gross waste of effort to learn each word’s pronunciation and 
its spelling independently, and it fails them when they meet a misspelled word, or a brand 
name or a person’s name, or a new word. When literate natives meet a new name or a new 
word, they may have some doubt as to how it should be pronounced, but seldom can they 
imagine more than two ways to pronounce it. And if they guess wrong, others can still 
understand them because we could all make a wrong guess. This is something foreign 
students must learn, without which they are unnecessarily handicapped in reading and 
writing, even in the minimal sort used in daily life (e.g. reading bus stop names, or in a
supermarket). They must learn the connection between pronunciation and spelling, to the
level that they can make guesses that we can sympathize with.

Given then that there is system and regularity to spelling and pronunciation in 
English, and that the foreign student must learn it, it is far wiser to learn the language with 
these two combined. In learning French, for example, one learns that OU stands for /u/, 
and one does not need to look up each word with OU in it to see how it is pronounced. In 
fact, one wisely forgets the symbol ‘u’ and just thinks of the sound /u/ as OU, because
French U is pronounced /y/. This is learning a language through its (native) orthography, 
and is a far faster and a more useful way of learning if it can be managed. The problem 
with English, however, is that there are so many exceptions, and the rules are so complex 
(Dickerson has been untangling them for some years) that it has generally been thought to 
be impossible. It is not. But to leave it to the students to discover what capable linguists 
can spend decades on means that they will not do it well. In teaching English-speaking
children to read and spell, as well, leaving it to children to discover first that there is a
system, in spite of the many exceptions that they learn early, and then what that system is 
in detail, implies that they also do not learn it well, or nearly as fast as they could. Merely 
to identify which words are exceptions would help them immensely. 



In a literate society, most people learn the principles of writing more or less well, 
and the leaders and pace-setters can invariably read and write well. If the orthography or 
writing system is based on sound, and nearly every one is (even Chinese), it is an aspect of 
the language that everyone knows to some extent. 

The validity of XD (cross-dialectal) phonology lies in its accurate description of at 
least the well-known standards. If it includes these, then it is present at least in part in the 
heads of all the speakers. As everyone also knows how to spell words (with some 
exceptions that fill the lists in spelling contests), and these spellings correspond to sounds, 
then an XD phonology must include an orthography for a literate language. This explains 
the naive identification of writing with a language. Except for those who have been taught 
in linguistics that writing is not part of a language,’ perhaps everyone in every literate 
society takes the written form of a word as its ‘real’ form. Thus in English, we ask how 
we should pronounce a word like albino, not how we should write a word [kod]. And the 
answer is with either a long-I or a continental-I — for uncommon words, any way that is 
consistent with the writing. 

As any school child knows, it is far more important to know the correct spelling of 
a word than to know its correct pronunciation, if it has only one (admitting variation in 
dialect and register). The written word has an importance that far outweighs the spoken 
word for the naive, and we can bring this fact into linguistics by accepting the orthography 
as the preeminent dialect. 

Typically, orthographies are more differentiated than any spoken dialect. At least 
some sounds have several ways of being written, even in Spanish, while normally a given 
symbol or combination of symbols has only one proper way of being pronounced 
(conditioned sometimes by context) in a given dialect and register. English is the major 
exception to this latter, having several ways to pronounce each vowel letter that is only 
partly and unreliably (Upward, 1988) marked in the spelling of the rest of the word. 

In being more differentiated than any spoken dialect, an orthography can often serve 
as an XD phonology itself, as suggested above for French, its letters and letter-combinations 
being pronounced in different ways from dialect to dialect. Where it is not sufficient, the 
orthography can be supplemented by diacritic marks to make a ‘learning orthography’ that 
serves as an XD phonology for most if not all spoken dialects and the standard orthography 
itself. 

An example of how this might work for English can be seen in adopting a device 
from French, to add a cedilla (originally a small S written under a C) when a C is 
pronounced /s/, and extend it to CH when that is pronounced as /ʃ/, as in Chicago, machine
and so on. If such marks are really useful, however, there is a possibility that they will 
come to be adopted into the orthography as has happened in a number of European 
languages. 

I have done this for English vowels, to make an ‘English Teaching Alphabet’, using
the more or less standard ‘bar’ (macron) and ‘cup’ (breve) for long and short values for
vowel letters (as in previous editions of The Concise Oxford Dictionary), and supplemented
that with a ‘hat’ (circumflex) for continental pronunciations. These are applied to the 5
vowel letters (Y and W are treated as variants of I and U when needed) to give 15 vowel
diaphones, plus <ŏŏ, ōō, ŏŭ, ŏĭ, au> (see Appendix). It is sufficient to indicate the
pronunciation of nearly all of the words in major dialects, as well as their written forms (by 
erasing the diacritics). The letters U, O, and A also have a fourth pronunciation each, <ŏŏ,
ŭ, ŏ> respectively, in a few dozen common words like püsh, püt; cöme, löve; wäsh, wärd,
so these are marked with two dots (diaeresis) to give a nearly perfectly self-pronouncing way
of writing English. There are a dozen exceptional words that must be learned by rote, but
if pointed out, they should cause no great problem.
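
As a rough illustration of how "erasing the diacritics" recovers the ordinary written form, the following Python sketch strips all combining marks from a marked word. The marked spellings used here (tīme, sĭt, püsh, cöme) are only illustrative guesses at what such marking might look like, not the author's published ETA, and `strip_marks` is a hypothetical helper name.

```python
import unicodedata


def strip_marks(marked: str) -> str:
    """Return the ordinary spelling of a diacritic-marked word.

    Assumption: every teaching mark is a Unicode combining mark
    (macron, breve, circumflex, diaeresis) attached to a plain letter.
    """
    # Split each marked letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", marked)
    # Keep only the base letters; combining() is non-zero for marks.
    bare = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", bare)


# Hypothetical marked forms, for illustration only.
for word in ["tīme", "sĭt", "lāte", "püsh", "cöme"]:
    print(word, "->", strip_marks(word))
```

Because the marks are additions to the standard letters rather than replacements for them, the mapping from marked form to ordinary spelling needs no lookup table, which is the property the paragraph above relies on.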

This English Teaching Alphabet (ETA) is thus a good medium for teaching English 
to foreign students, as users learn both the writing and the pronunciation of a word (in all 
major dialects) at once, with little more effort than learning the spelling. And they learn 
without additional effort the connections between sound and spelling. It is also a good 
medium for initial literacy training for native children, like the i.t.a. (initial teaching 
alphabet) or other American systems, without being biased for British or American 
pronunciations, and working well for whatever native dialect the child has. 



APPLICATIONS OF XD PHONOLOGY AND IN PARTICULAR ETA 



There is a recent trend in England to use IPA (International Phonetic Alphabet) 
symbols in dictionaries to show pronunciation. This makes it superficially easier for 
foreigners using a learner’s dictionary to make a passable rendition of the British standard 
pronunciation, for the symbols look like symbols in their own language and have roughly 
the same (but never quite the same) sounds. Furthermore, it looks scientific, as linguistics 
uses those symbols with approximately constant value for any language. Nevertheless, it is
an unnecessary burden for an English learner as well as a native English speaker to learn a
second way of spelling, for sound only, and it misleads the foreign learner because those 
symbols, as scientific and as international as they are supposed to be, are pronounced 
differently for each language. It is worse than this, however, as those symbols directly 
conflict with the symbols used to write English: 



[Diagram: the English spellings O..e, OU, AU, EE, I..e, and AI set against the IPA symbols /ou/, /au/, /o/, /i:/, /i/, /ai/, /e:/, and /ei/.]

This makes it harder to learn to spell or to pronounce from spelling, for both the native
and the foreigner: an unmitigated blunder in the best British tradition, though some American
dictionaries are now using a few IPA symbols, too.

Up to its most recent, Europeanized edition, the Concise Oxford Dictionary indicated 
pronunciation in an efficient and informative way, called ‘pronunciation without respelling’, 
by simply adding long and short marks to the spelled form of the word. Spelling is thus 
united with pronunciation, not only saving space in dictionaries, but also in learners’ 
memories, and time in learning. Instead of two representations for sound and for writing 
of every word, without a clear relationship between them, one representation with the 
relationship between spelling and pronouncing clearly marked will do. 

Of course, a few exceptionally pronounced words needed to be given a pronunciation 
apart from their spelling, but for nearly all words, this was enough, saving dictionary space 
and user difficulty. If 5% to 10% of an average dictionary entry is devoted to giving the 
pronunciation, getting rid of it saves 5% to 10% of the size and weight of a dictionary.
And it is eminently more usable than a bunch of esoteric symbols that conflict with the
spelling. A comfortable side-effect is to identify which words are exceptions to the general
system of English - those like busy that must be respelled to show their pronunciation,
<bĭzzy>.

The same applies to learner’s dictionaries. Although it may be a bit more initial work 
learning what the long and short values of the vowel letters are, once they are learned, they 
provide an accurate guide to pronunciation regardless of dialect, and they don’t allow the 
half-learning of taking IPA symbols to have the values of vowel letters in their native 
language. 

What can be said for dictionaries needs to be said many times louder for teaching 
materials. The US Army Language Program in World War II showed that a language can 
be learned in a very short time if the learners are not burdened with illogical and difficult 
writing systems but provided instead with a phonemic transcription. It should have been 
clear that the most efficient way to teach English is to teach the students phonemic 
transcription and let them initially read and write in that only. In fact, it cannot be doubted 
that if a beginning student is given a phonemic transcription of Texan English, for example, 
and can read and write only that for a few months, he will be speaking rather like a
Texan, far more so than most students today approach any variety of English. Phonemic
transcription is a powerful language teaching device, but it is used only for teaching Japanese
and some other Asian languages. 

Phonemic transcription is not needed for most European languages as they have
diacritic marks that give the exact pronunciation. English however has no such devices, so 
learning it is burdened by having to learn both spelling and pronunciation. 

The excuse for not using phonemic transcription in English classes is that the teacher 
and the student both realize that speaking English is not enough in today’s world. Writing 
is important as a passport for tourists, and rather more important for studying abroad. 
Learning a phonemic transcription would be nice, but is almost universally seen as extra, 
and largely unnecessary work. For most European languages, the orthography can serve as 
a pretty accurate XD phonology; once students get acquainted with it, they rightly abandon
IPA symbols and learn instead the language. 

This doesn’t work for English. Exceptions aside, our orthography is an accretion of 
too many different systems of spelling; one cannot look at a word and tell how it should be 
pronounced, at least until one is quite accomplished at English. Even then, one can never 
be sure. 

The usual and highly unsatisfactory syllabus (especially in Japan) includes a course on 
phonetics, where the IPA symbols are introduced and practiced. Besides the fact that the 
class is soon over and may be forgotten with relative impunity, even the learning that does 
go on may be hard to apply. Japanese students are taught to distinguish [i] and [ɪ], for
example, and many learn it, but they don’t know which words have [i] and which words
have [ɪ]. They practice this contrast only with 20 or so words in the phonetics class, and
forget most of those. When they meet any of the thousands of other words with these 
sounds, they don’t know which sound to pronounce. The same, several times worse, goes 
for [ae, a, o, a, or, er], all pronounced the same in Japanese, and hard for them to 
distinguish. 

In the same phonetics class, however, they learn to distinguish [b] and [v], but here 
their knowledge can be applied to speaking, reading, hearing, and even writing. As [b] is 
always written B, and [v] as V, and never the reverse, they know which sound to pronounce 
if they know the written form of the word. Because they have often learned the written 
forms before they go to phonetics class, the phonetic lessons can result in an almost 
immediate improvement in pronunciation outside the phonetics class. This works for the 
consonants in English. There are a few exceptions, primarily silent letters, but these are 
easily noted and remembered. 

It emphatically does not work for vowels in English, as there are alternate spellings for 
many vowel sounds, and several pronunciations for most single vowel letters. A stop-gap 
rule for the example above is to teach students to decide between [i] and [ɪ] by looking at
the spelling: If it is spelled with an E, as in EE, EA, IE, EI, it is surely [i] and never [ɪ],
but if it is spelled with a simple I, it is almost always [ɪ]. Unfortunately, even this doesn’t
work perfectly, for there are also a few continental pronunciations, as in police, machine.
A similar but less perfect rule of thumb can be given for the other group of vowels, based 
on spelling with A, O, U, AR, and UR/IR. 
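
The stop-gap rule for the two high front vowels can be written out as a small decision procedure. The Python sketch below is only an illustration under stated assumptions: the function name `guess_vowel`, the idea of passing in the vowel spelling separately from the word, and the two-word exception list are all hypothetical, and the rule itself is, as noted, imperfect.

```python
# Spellings containing E, which the rule treats as always [i].
E_SPELLINGS = {"EE", "EA", "IE", "EI"}

# Words where a simple I has the continental value [i].
# Only the two examples from the text; a real list would be longer.
CONTINENTAL_WORDS = {"police", "machine"}


def guess_vowel(word: str, vowel_spelling: str) -> str:
    """Guess [i] or [ɪ] for a word from the spelling of its vowel."""
    if word.lower() in CONTINENTAL_WORDS:
        return "i"  # known continental exception
    spelling = vowel_spelling.upper()
    if spelling in E_SPELLINGS:
        return "i"  # spelled with an E: surely [i]
    if spelling == "I":
        return "ɪ"  # simple I: almost always [ɪ]
    raise ValueError(f"rule does not cover spelling {vowel_spelling!r}")


for word, spelling in [("leap", "EA"), ("lip", "I"), ("machine", "I")]:
    print(word, "->", guess_vowel(word, spelling))
```

The exception list is the weak point: every continental word has to be listed by hand, which is the limitation the paragraph above points out.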

Thus a more satisfactory approach is to find and teach the rules whereby a native 
English speaker can be fairly confident of the pronunciation of a word (see Dickerson, 
1985), but this is limited by the fact that there are a large number of exceptions and they are 
mostly high frequency words that the foreign student must deal with first. It is also limited
by the fact that there are many words with vowels pronounced with continental values, and 
they are continually increasing. 

If the rules for pronunciation from spelling were as simple as they are in French, German
or most European languages, that would be enough. In English, however, they are fairly
complex, and also unreliable (e.g. the one about a stressed vowel being ‘long’ if separated
from a following vowel by only one letter). Thus we need diacritics for pronunciation until
the learner can internalize the rules. 

Granted that students must eventually learn such rules, either by being taught or by 
discovering them themselves, there is a better way for beginning students than to load them 
with a pile of rules and turn them loose on English orthography. The ETA described above 
is simply a few diacritics added to ordinary spelling, but it gives the pronunciation 
accurately. The combinations of diacritics with vowel letters must be learned, but that is 
needed anyway to learn the rules above. And once it has been learned, students can be 
given text with ETA diacritics (the teacher can add them as needed) and forced to write with 
them, too. In this way, students can approach the efficiency of learning with a phonemic 
transcription, with very little extra work, and be learning the written form of the language 
at the same time. 

I am thus convinced that beginning materials need to be supplemented with such 
diacritics.⁴ Beginning students will naturally learn orthography and pronunciation in a
coordinated and effortless fashion, and will no doubt discover some of the rules for 
pronunciation without prompting. Moreover, as ETA is valid for most dialects, learners will 
be much less confused by teachers from different regions who pronounce differently. 

It thus seems clear that language learners will benefit greatly from adopting some form 
of ETA marking for beginning students. Publishers need not shy away from indicating 
pronunciation as it applies to all major dialects. Last, if the marking is withdrawn for the 
vowel pronunciations as they are explained, the learners can discover for themselves or be 
told the rules (and their limitations). And this can be done one rule at a time, separating the 
task into independent subcomponents for faster learning. 

The benefits of ETA do not stop there, however, as those same materials can be a great 
help in native literacy training. Teaching reading with the i.t.a. (initial teaching alphabet), 
which was phonemic for standard British pronunciation, showed that children could learn to 
read in incredibly less time than by the ordinary orthography with all its conflicting patterns
and exceptions. While the ETA has extra symbols, often several for any vowel phoneme
in any particular dialect, it is neutral between dialects, and has no strange and
foreign-looking symbols that teachers have to be taught and parents need to get used to. As children
have little problem in learning several symbols for the same thing (e.g. ‘4’, ‘IV’, ‘four’),
teaching reading through ETA promises much the same benefits as the i.t.a. without its
drawbacks - and there is not even a conceptual problem in their transition to ordinary
orthography (the i.t.a.-trained children had few real problems, though it seemed like they
ought to have). As such, I hope to see native language reading texts sprouting ETA 
diacritics someday, to allow us to catch up to the rest of the world in literacy. 

NOTES

¹With the rise in status and power of the Southern states as well as Black speakers.
Consider the number of recent presidents without this /r/: Kennedy, Johnson, Carter, and 
Clinton — or the dialect of political power (Washington). 

²All but for explicable exceptions, as e.g. the blind, deaf and so on.

³There was, and still is, good reason for this dogma, but dogma it is, and is easily
shown to be partially false, at least for a literate language. Not least is the point here, that
it is the preeminent dialect.

⁴I have made a close definition of ETA available to teachers or publishers who want to
use it. It is, however, copyrighted to avoid a thousand varieties arising, or its misuse. If 
you want to do something similar, please call it something else, but please also consider the 
disadvantages of having two or five or ten different competing systems. 

APPENDIX

English Teaching Alphabet

Non-R Stressed Vocalic Complexes

Symbol    Examples
ŏĭ        void, boy
ī         time, pipe
ē         bead, leap
ĭ         sit, lip
ā         late, ape
ĕ         let, peck
ă         lap, sat
ă²        laugh, mass
â         father, lager
ŏ         cop, lock
ŏ²        loss, often
au        laud, audible
ō         load, rope
ŭ         luck, up
ŏŏ        look, push
ōō        loop, room
ū         mute, pewter
ŏŭ        loud, house

Stressed Vocalic Complexes + R

Symbol    Examples
īr        fire, lyre
ēr        near, beer
ār        square, pair
âr        lark, bark
ôr        or, horse
ōr        ore, hoarse
ûr        fur, fir
ōōr       moor, tour
ūr        cure, obscure
ŏŭr       flour, flower



IDEAL 7, 1994



ATTITUDE AS A DESCRIPTION OF INTONATIONAL MEANING 



John M. Levis 



One of the most common explanations for the meaning of intonation is 
the expression of speaker attitude, usually described through the use of 
specific adjectival descriptors (e.g. angry, surprised). However, the concept 
of attitude in intonational studies is fundamentally different from attitude as 
it is used in research in the social sciences. Important differences between 
the two traditions of attitude research are discussed and it is argued that 
attitude is an inadequate concept to describe intonational meaning because of 
the idiosyncratic definition of attitude used for intonational research. 



Intonation performs many tasks in English, from signalling grammatical changes 
(Halliday, 1967) to expressing emotions and involvement (Bolinger, 1986) to highlighting 
the information structure of discourse (Bardovi-Harlig, 1986; Brown & Yule, 1983) to 
implicating certain types of meaning (Ward & Hirschberg, 1985). One of the most common 
explanations of intonational meaning, however, is the expression of speaker attitude. In fact, 
some researchers have argued that this is the essential meaning of intonation (Pike, 1945; 
O’Connor & Arnold, 1963). 

The assumption that intonation conveys speaker attitude is so common in writings on 
intonation as to be almost an article of faith. Pike (1945, pp. 21-22), one of the most 
influential proponents of this position, says: 

In English ... an INTONATION MEANING modifies the lexical meaning of
a sentence by adding to it the SPEAKER’S ATTITUDE toward the contents
of that sentence (or an indication of the attitude with which the speaker
expects the hearer to react). ... An extraordinary characteristic of intonation
contours is the tremendous connotative power of their elusive meanings. One
might hastily and erroneously assume that forms which change so rapidly and
automatically could not be semantically potent. Actually, we often react more
violently to the intonational meanings than to the lexical ones; if a man’s tone
of voice belies his words, we immediately assume that the intonation more
faithfully reflects his true linguistic intentions. Thus, if someone says, Is
breakfast ready yet? the sentence is either innocuous or an insult according
to whether it is spoken nicely or nastily - and if the insult is resented, the
speaker defends himself by saying, I just asked if breakfast were ready and
she flew into a rage. This illustrates the fact that the intonation contours,
though fluctuating like the speaker’s attitude, are as strong in their implication
as the attitude which they represent; in actual speech, the hearer is frequently
more interested in the speaker’s attitude than in his words - that is, whether
a sentence is "spoken with a smile" or with a sneer.

The obvious facts that the same string of words can be spoken with different prosodic
characteristics and that the choice of prosody will communicate different messages are the 
basis of Pike’s assertion that intonation conveys a speaker’s attitude. This, in fact, he sees 
as the primary function of intonation, although Ladd (1980) argues convincingly that attitude 
is conveyed by lexical choices, syntactic structures and other linguistic choices in addition 
to intonation. 

The meaning of the term "attitude" shows some variation in the intonational 
literature. Primarily, it is expressed in one of two ways. The first is exemplified by Pike 
(1945), who expressed attitudinal meanings in general terms, with the basic meaning of the
intonation contour being the lowest common denominator of all the uses of the intonational
pattern. Thus the falling contours all have the basic meaning of "Contrastive Pointing" of
"Center of Attention" (1945, p. 44).¹ An assumption of this approach is that intonation
contours are morphemic, that is, that each significant contour is a meaning-carrying unit in
the message. The basic morphemic meaning can be interpreted to provide other local 
meanings, such as "surprise, unexpectedness, detachedness" in addition to others, depending 
on the kind of falling contour used and the sentence it occurs with. The specific labels (e.g. 
surprise) are not the primary meanings of intonation, but are interpretations based on the 
more basic, abstract meaning. The second approach is exemplified by the British writers 
O’Connor and Arnold (1963). Although they espouse much the same basic position as Pike, 
they focus on the specific attitude labels common to non-experts. Thus a particular 
intonation means things as varied as reserving judgment, reproving criticism, resentful 
contradictions, wondering, disapproving, menacing, skepticism, calm warning, exhortation, 
or calm, casual acknowledgement (for Tone group 6, see pp. 48-53). There is little 
discussion about how all these meanings might fit together. Though the level of generality 
is an important difference between the two positions, both agree on the essential issue: 
intonation expresses a speaker’s attitude toward the message. 

The belief that intonation expresses speaker attitude is intuitively logical. Clearly, 
prosody changes the interpretation of an utterance. However, the claims that intonation 
expresses a speaker’s attitude are based on an idiosyncratic concept of attitude that is not 
shared by mainstream attitude research in the social and psychological sciences. The 
difference between the view of attitude used in intonational studies and the one used in the 
mainstream of attitude research necessitates an examination of the assumptions that underlie 
two separate traditions of attitude definition and measurement. The differences between 
these two traditions show that studies of intonation and attitude do not define or measure 
attitude in the way that the vast majority of other attitude research does. Because the
differences are so fundamental, there is strong reason to believe that intonational research 
on attitude is both inaccurate and misleading. 



ATTITUDE IN THE SOCIAL SCIENCES 

In 1935, Gordon Allport said that "the concept of attitude is probably the most 
distinctive and indispensable concept in contemporary American social psychology" (p. 3). 
The concept lost none of its popularity in the ensuing years and in 1967, Martin Fishbein 
said that the enormity of the research literature and its proliferation in a myriad of sources 
made it practically impossible for anyone to keep up with the research on attitude. Although 
the amount of research done on attitude is so large as to be impossible to digest, intonational
studies show an almost total ignorance of both the research done and the definitions used. 

The early years of attitude research were preoccupied with the definition of attitude 
and its measurement. Early definitions tended to focus on the behavioral consequences of 
the assumed inner state, on the assumption that a direct link existed between attitude and 
behavior. In this vein, Allport (1935, pp. 7-8) reviews many older and then current
definitions of attitude and concludes that the "essential feature of attitude (is) a preparation
or readiness for response." Thus attitude is the precondition for behavior and may exist in
varying strengths and readiness, from that which is deeply buried to that which will affect
behavior most readily. However, despite the behavioral focus of this definition, Allport also
criticizes the view that attitudes are merely readiness for behavior as being insufficient to 
distinguish attitudes from habits. 

Although the search for a connection between attitudes and actions has continued, the 
difficulties of a strictly behavioral definition of attitude were shown very early by Richard 
LaPiere (1934) in a study that examined racial prejudice of lodging and restaurant owners
both by compiling their mailed responses to questionnaires and by confronting them with two
potential Chinese guests. For two years beginning in 1930, LaPiere traveled extensively
throughout the United States with a young Chinese couple. Previous social distance studies
indicated that prejudice would cause lodging to be denied because of race in a majority of
cases. However, out of 251 encounters, only one ended in a rejection.² This result was so
surprising that a questionnaire was distributed after the two-year period to the same
establishments, asking specifically if Chinese people would be accommodated in the 
establishment. The response was 91% negative for those who responded, with the remainder 
being uncertain, and no one answering yes. This difference from the actual behavior in 
these establishments shows the operation of other factors between the polled attitude and the 
actual behavior. 

In 1947, Doob wrote that attitude is "an implicit, drive-producing response considered
socially significant in the individual's society" (p. 42). The drives produced by attitudes 
were no longer behavioral in the sense of overt actions, but were instead primarily within 
the individual and not necessarily apparent to other people. In addition, the responses to 
attitudes can be an internal perception of a situation, thoughts, images, stereotypes, linguistic 
responses or overt action (p. 47). Significantly, the social importance of attitudes is built 
into a definition for the first time. 

Later attempts at definition continued to look for correlations between attitude and 
behavior without any promising results. The inability of attitude research to predict behavior 
led some researchers to reject attitude research as hopelessly irrelevant (Rimland, 1976) 
while others have instead tried to find reasons for the mismatch (McGuire, 1976). What is 
clear, however, is that there is very little correlation between what people describe as their 
attitudes and what they actually do. 

The difficulty of treating behavior within the definition of attitude led to a 
multidimensional definition of attitude. Attitude, rather than being strictly evaluative, was 
thought to have three components: the affective, cognitive (beliefs), and conative (actions)
components. Osgood (1967) calls these three factors the evaluative, potency and activity
factors, but he argues that attitude loads primarily on the evaluative component. In other
words, while research points to the existence of at least three kinds of affective meaning, the
evaluative component is the heart of what most researchers call attitude. Even though
traditional conceptions of attitude were found wanting, the multidimensional definition was
very difficult to use in practice. This led to an attempt to restrict attitude to a smaller
domain while trying to account for the other elements of meaning in another way. Fishbein
(1965) argues that the other elements of meaning are not attitude, but rather are part of the
beliefs of an individual. The cognitive aspect he treats as the belief in an object, while the
conative component would be beliefs about an object. Thus belief in the existence of God
would be cognitive, while agreement that God is all-powerful would be conative. In
contrast, feeling comfort or terror at the prospect of an all-powerful God would be 
evaluative, and would, Fishbein argues, be attitudinal. Fishbein argues that progress can 
best be made in attitudinal research if the definition of attitude is limited to the evaluative 
component and by treating the other components as part of belief rather than attitude. 

Since the 1960s, questions about the definition of attitude have not been a major 
element in the social sciences, as though the issue had been settled as well as could be hoped 
for. The consensus view is that attitude is essentially a positive or negative evaluation of 
some social issue. Attitude is assumed to be measurable despite the difficulties in definition 
and is also assumed to be a reflection of psychological reality. In intonational studies, in 
contrast, the definition of attitude is far from settled and is considered by some researchers 
to be the basic problem involved in describing intonational meaning (Ladd et al., 1985). 



The mainstream of attitude research has its own set of assumptions about what 
attitudes are and how they can be inferred and measured. This research tradition stands in 
contrast to the attitude research which focuses on the meaning of intonation, which makes 
very different assumptions about the nature of attitude and how it should be measured. 
Representative examples of studies in the social sciences and in intonational research will 
show the most important differences between the conception of attitude in each tradition. 
Although the variety of designs used in the social sciences is too great to look at in detail, 
attitude studies in the social and psychological sciences have typically sought to measure the 
attitudes held by groups of people toward a particular concept. 

Mainstream attitude research will be exemplified by an early study done by Thurstone 
in the early 1930s which examined the influence on conduct from either belief or lack of 
belief in God (Shaw & Wright, 1967, p. 2750). A series of statements bearing on the 
influence of such beliefs was constructed, tested and scaled on an 11-point Thurstone-type
scale. Subjects then responded to each statement, and the combined results of the responses 
were taken to be an indirect measure of the attitude toward the concept. The typical 
operational definition of attitude in mainstream attitude research is described by McGuire 
(1967, p. 10). 

Typically, the person’s attitude toward an object on some dimension of 
variability is measured by presenting the person with a proposition that 
specifies the object and the dimension of judgment, accompanied with a 
response scale on which the person can make a mark which indicates where
the person feels this object should be assigned on this dimension of 
variability. 
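
To make the Thurstone-type procedure concrete, the Python sketch below scores one respondent. The text does not say how the responses were combined, so this assumes one common convention: each statement carries a pre-established scale value on the 11-point scale, and a respondent's attitude score is the median scale value of the statements he or she endorses. The statement labels, scale values, and the function name `thurstone_score` are all hypothetical.

```python
from statistics import median

# Hypothetical scale values (1 = least favorable, 11 = most favorable),
# as would be established beforehand by a panel of judges.
scale_values = {
    "s1": 2.1,
    "s2": 5.4,
    "s3": 8.7,
    "s4": 10.2,
}


def thurstone_score(values: dict[str, float], endorsed: list[str]) -> float:
    """Median scale value of the statements a respondent agrees with."""
    chosen = [values[statement] for statement in endorsed]
    if not chosen:
        raise ValueError("respondent endorsed no statements")
    return median(chosen)


# A respondent who agrees with three of the four statements.
print(thurstone_score(scale_values, ["s2", "s3", "s4"]))
```

The point of the sketch is that the score is always a position on a single favorable-unfavorable dimension toward a named concept, which is the property contrasted with intonational studies in the discussion that follows.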

It is common for production experiments, where speakers try to express particular 
emotions or attitudes, to feed into perception experiments (decide which attitude is expressed 
by this sentence). However, production experiments can and have stood alone, as in many 
intonation and emotion experiments (see Scherer, 1979). Intonational research on attitude 
follows a different approach. I will focus on perception experiments as a comparison to the 
Thurstone study described above.³ In this type of research, subjects make judgments about
sentences using adjectives describing affective states (e.g. friendly, angry). Scherer et al. 
(1984), for example, used nine adjectives to represent attitudes. The adjectives had as little 
overlap as possible with one another while providing an adequate choice to the subjects 
among different types of attitudes. Subjects were instructed to identify with an "X" all of
the adjectives that described the attitude of the speaker and to indicate with two "X"s the
adjectives that were extremely appropriate (two Xs were used in only nine percent of the 
responses). While subjects could choose as many adjectives as they liked, the results showed 
that only one or two were chosen for each utterance. 

There are a number of important differences between the Thurstone and the Scherer 
et al. studies that reveal fundamental differences between the two traditions, each of which 
will be discussed below. These are the object of the attitudes, the way in which attitudes 
are inferred, the nature of attitude, the measurement scales, and the use of social groupings 
in subject selection. 

First, in the social sciences, attitudes always exist in relation to some social concept. 
The concept may be as concrete as a particular government policy, or it may be as abstract 
as belief in God. However, there must be a concept that one can hold attitudes about. 
Attitudes do not exist in the abstract, but only in relation to a concept. As the concept 
changes, so will the attitudes. It is not possible to ask what someone’s attitudes are without 
asking what the attitudes are about. 

Intonation studies, in contrast, are not interested in measuring evaluations of a 
concept. Instead, the goal of this tradition has been to determine the kind of message that 
is conveyed through intonation. It could be argued that intonation itself is the concept being 
evaluated. However, this position runs into serious difficulties of both definition and 
measurement. First, such a position assumes that the intonation itself is an identifiable social 
concept. While intonation performs social functions, there is no evidence that each 
intonation pattern can be defined in the same way that social issues like race relations or 
government policy are. Evaluating intonation as a social concept in its own right assumes 
two things: 1) intonation has a context-independent meaning and, 2) that meaning can be
accurately described with labels such as friendly, impatient, or doubtful. Many researchers
have argued that intonation does make an independent contribution to the meaning of an 
utterance (Pike, 1945; Liberman, 1975; Ladd, 1980; Ward & Hirschberg, 1985; 
Pierrehumbert & Hirschberg, 1990). However, there is abundant evidence that judgments 
of intonation differ in relation to sentence type (e.g. falling pitch is judged differently for 
yes/no than for WH questions; see Uldall, 1960; Scherer et al., 1984), lexical content (Ladd, 
1980), and context (McLemore, 1991). Even if it is granted that intonation patterns have 
context-independent meanings, the meanings must be at a level of abstractness that is far 
beyond the rather specific labels used in most intonational studies. There is another 
argument against considering intonation as a social concept in its own right. In most 
traditional studies, a concept is evaluated through some kind of language medium. However, 
the intonational signal is only one element of a language stimulus and does not exist outside 
of other elements of language.⁴ Even if intonation were judged independently as a concept 
(as with humming), such judgments would ultimately be meaningless since intonation can 
only find interpretation in conjunction with other elements of language. It is clear that 
changes in intonation can change the interpretation of an utterance. It is unlikely, however, 
that this difference can be described in terms of attitude as it is generally understood. 
Intonation is too much a part of the overall message to be evaluated as a social concept in 
its own right. In summary, the question asked by intonational studies seems to be whether 
subjects agree on the interpretation of an utterance said in a particular way, not how the use 
of particular intonational patterns signals positive or negative evaluation toward a social 
concept. The clear difference between the conception of attitude in intonational and non- 
intonational studies calls into question whether the two traditions are really measuring the 
same construct. 

Second, in the social and psychological sciences, attitude is inferred by response to 
statements that encode propositions about the concept in question. Since attitude is neither 
directly observable nor inferable from any single statement, studies use a large 
number of statements that bear on the concept, some of which are negatively weighted and 
others which are positively weighted. Inferring attitudes based on one statement, or even 
a small number of statements, would rightly call the validity of the conclusions into question. 

Intonational studies, on the other hand, use a very small number of sentences that 
deliberately avoid any relationship to a social context. Uldall (1960, 1964), for example, 
chose four sentences (a statement, a yes/no question, a WH question, and a command), each 
of which she synthesized with 16 pitch patterns. She said that "the sentences were intended 
to be as colourless as possible so as to allow the intonation to add as much as possible to 
their meaning, and so that they would fit into as many situations as possible when combined 
with different intonations" (1960, p. 224). Keeping the sentences as decontextualized as 
possible has been a virtue in intonational research since it has been important to maximize 
the impact of the intonation. In a much later study, Ladd et al. (1985, p. 436) chose three 
sentences that also could be consistent with many different attitudes. Again, the three 
sentences used in the study were decontextualized and unrelated to each other. In other 
words, what is essential in non-intonational studies (viz., many statements scaled in relation 
to a social concept) is avoided in intonational studies, which use very few decontextualized 
and unrelated sentences. 

Third, in non-intonational studies, attitude is seen as a judgment that is essentially 
positive or negative in nature. No specific labels (e.g. anger, surprise) are attached to the 
positive/negative judgments made. Osgood, Suci, & Tannenbaum (1957) argue that attitude 
is essentially evaluative but they also carefully avoid speaking about specific labels as being 
attitudes. The semantic differential, Osgood’s measurement instrument designed to measure 
meaning, uses specific adjectival labels, as seen below. However, the specific labels are 
important only insofar as they help define a point in a semantic space. They have no 
importance in and of themselves. Osgood et al. (1967, p. 26) state: 

What is meant by "differentiating" the meaning of a concept? When a subject 
judges a concept against a series of scales, e.g., 

FATHER 

happy ___:___:___:___:___:___:___ sad 
hard ___:___:___:___:___:___:___ soft 
slow ___:___:___:___:___:___:___ fast, etc. 

each judgment represents a selection among a set of given alternatives and 
serves to localize the concept as a point in the semantic space. The larger the 
number of scales and the more representative the selection of these scales, the 
more validly does this point in the space represent the operational meaning of 
the concept. 
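
The notion of localizing a concept as "a point in the semantic space" can be made concrete with a small sketch. What follows is only an illustration and is not taken from Osgood et al.: the 1-to-7 coding of the scale steps, the ratings themselves, the second concept (MOTHER), and the function names are all assumptions introduced here. It shows each judgment becoming one coordinate, so that the individual adjectives matter only as axes of the space.

```python
from math import sqrt

# Hypothetical ratings on 7-step bipolar scales, coded 1 (left-hand adjective)
# to 7 (right-hand adjective). All values are invented for illustration.
SCALES = [("happy", "sad"), ("hard", "soft"), ("slow", "fast")]
father = [2, 3, 5]
mother = [2, 6, 4]

def semantic_point(ratings, midpoint=4):
    """Centre each rating on the neutral midpoint to obtain coordinates."""
    return [r - midpoint for r in ratings]

def distance(a, b):
    """Euclidean distance between two concepts in the semantic space."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

father_point = semantic_point(father)
mother_point = semantic_point(mother)
print(dict(zip(SCALES, father_point)))
print(round(distance(father_point, mother_point), 2))
```

With more scales the point is fixed in more dimensions, which is the sense in which a larger and more representative set of scales gives a more valid location for the concept.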

In contrast, most intonational studies focus on the specific labels rather than a general 
evaluative judgment. The labels are the goal rather than being a means to an end. This is 
most clearly seen in the production-oriented studies which have attempted to discover the 
acoustic correlates of particular labels (Scherer, 1979; Crystal, 1969). Lieberman and 
Michaels (1962, p. 922), for example, asked subjects to read eight sentences in eight 
different ways: a bored statement, a confidential communication, a question expressing 
disbelief, a message expressing fear, one expressing happiness, an objective question, an 
objective statement, and a pompous statement.⁵ After analyzing the sentences acoustically, 
they chose the best acoustic examples for a matching test. In Scherer et al. (1984), an initial 
list of 250 adjectives was sorted by judges into those that were suitable and unsuitable for 
judging the speech stimuli (in this case, a variety of yes/no and information questions). 
From those that were suitable, nine adjectives "that overlapped as little as possible and 
provided an adequate choice to the subjects" were chosen (p. 1348). Thus the essential 
criterion for the construction of this type of scale is that the adjectives be as distinct as 
possible from one another. The need to do this in itself indicates the difficulty of 
distinguishing the meanings of the labels, a difficulty discussed in some detail in Ladd 
(1980). A later study (Ladd et al., 1985) also used a set of carefully contrasting attitudinal 
and emotional labels. Although results reached significance for most of the labels, the 
authors were dissatisfied and concluded that such labels were too imprecise to really describe 
the meaning of the intonation. They say (p. 442): 

Perhaps the most important weakness of this study, and indeed of the whole 
general area of research, is the absence of a widely accepted taxonomy of 
emotion and attitude. Not only does this make it difficult to state hypotheses 
and predictions clearly, but (on a more practical level) it makes it difficult to 
select appropriate labels in designing rating forms. 

Although the authors argue that what is needed is a better set of labels, most mainstream 
attitude researchers have decided that such a specification is neither necessary nor possible. 
Since attitude is an evaluative judgment that is scaled positively or negatively, there is no 
need to specify as carefully as intonation researchers wish. One social science attitude 
researcher, instead of specifying a taxonomy, takes the opposite position and includes 
attitude, emotion, and other terms in what he calls "acquired behavioral dispositions" 
(McGuire, 1976, p. 12). 

Some related constructs are so close to attitudes in their connotation that they 
should be used interchangeably. Falling into this category, with decreasing 
obviousness, are terms such as opinions, beliefs, cognitions, prejudices, 
preferences, percepts, concepts, values and interests. Going slightly further 
afield, intervening constructs such as habits, motives, emotions, etc., have 
somewhat less connotational overlap with attitudes but play quite comparable 
roles in our thinking.... Stress on distinguishing these constructs is not only 
wasted effort but, more seriously, tends to blind us to communalities and the 
possibilities of analogous thinking from what is known on one of the 
constructs to what we are seeking to know about the others. 

The only intonational studies that use labels to get at more basic issues of meaning 
were conducted by Uldall (1960, 1964). Both studies used Osgood’s semantic differential 
as a measurement instrument. Uldall found evidence for all three meaning factors described 
by Osgood et al. (1957), that is, for the evaluative, potency, and activity meaning types. 
Evaluative meaning was the most powerful of the three factors, accounting for over 50 
percent of the variance. These studies then lend support to the idea that intonation expresses 
attitude, because the evaluative factor in Osgood et al. (1957) is argued to be most nearly 
equivalent with traditional conceptions of attitude. A difficulty with the studies is that they 
use an untenable taxonomy of intonation (Ladd, 1980). Because the definition of intonation 
is difficult to support, her results can only be suggestive of the kind of affective meaning 
conveyed by intonation. 

Fourth, in mainstream studies attitudes are typically measured along a scale. 
Although attitude is essentially a positive or negative evaluation of a concept, there are 
different degrees of positive or negative that are built into the measurement instruments. For 
example, a typical Thurstone type scale has eleven points. Statements about the concept 
being evaluated are sorted by experts along the eleven point scale according to their relative 
positiveness or negativeness toward the concept. The results of the sorting for several 
experts are then averaged and weighted. When subjects agree or disagree with a statement, 
the weighted score becomes part of the measure of their attitude toward the concept. The 
time intensiveness of the Thurstone scale led to the development of the Likert scale, in which 
statements are self-scaled along a five-point measure according to the degree of agreement 
with a statement. In both cases, however, the attitude judgment is not only positive or 
negative but is also scaled for degree. The scales used are minimally ordinal in nature, and 
some theorists have argued that many attitude scales are for all practical purposes interval 
scales (Osgood et al., 1967, p. 85; Shaw & Wright, 1967, p. 20ff.). 
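
The two scoring procedures just described can be sketched as follows. This is a minimal illustration under stated assumptions, not Thurstone's or Likert's published procedure: the statement identifiers, the judges' placements, the use of a simple mean both for the expert sorts and for the subject's score, and the reverse-keying of negatively worded Likert items are all choices made here for the sake of the example.

```python
from statistics import mean

def thurstone_scale_values(judge_sorts):
    """Average each statement's placement (1-11) across the expert judges."""
    return {stmt: mean(places) for stmt, places in judge_sorts.items()}

def thurstone_score(scale_values, endorsed):
    """Attitude score: mean scale value of the statements a subject agrees with."""
    return mean(scale_values[s] for s in endorsed)

def likert_score(responses, reverse_keyed=frozenset(), points=5):
    """Sum the agreement ratings, flipping negatively worded items."""
    return sum((points + 1 - r) if item in reverse_keyed else r
               for item, r in responses.items())

# Hypothetical expert sorts for three statements on the eleven-point scale.
judge_sorts = {"s1": [10, 11, 9], "s2": [6, 5, 7], "s3": [2, 1, 3]}
values = thurstone_scale_values(judge_sorts)

# A subject who agrees with s1 and s2 only.
print(thurstone_score(values, ["s1", "s2"]))

# A subject's self-scaled Likert ratings; item "b" is negatively worded.
print(likert_score({"a": 5, "b": 2}, reverse_keyed={"b"}))
```

In both cases the outcome is a number that can be ordered and compared across subjects, which is what makes these scales at least ordinal.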

Intonational studies, in contrast, typically use measurement instruments with nominal 
variables. In one apparent use of a scale, Scherer et al. (1984) asked subjects to choose the 
best adjectives to describe the attitude conveyed by a speaker. Subjects could mark one X 
or two Xs depending on how strongly the utterance represented a particular label. 
However, fewer than nine percent of the responses used two Xs, creating an essentially 
nominal scale. Other studies have not allowed a scaling of the particular adjective labels, 
i.e. it has not been possible to say that a sentence with a particular intonation conveyed more 
or less of a particular affective state. Because the labels are essentially variables measured 
on a nominal scale, intonational research on attitude lacks an important assumption of 
attitude research: the use of degrees of approval/disapproval. 

Fifth, in the social sciences, attitude research often compares groups of subjects in 
regard to a particular issue. This is possible because attitudes relate to social concepts and 
scales are meant to maximally distinguish between groups which are in a different 
relationship to the concept. For example, Southern white males and Northern white males 
could be targeted for their attitudes toward capital punishment, since there may be some 
reason to believe that these two groups differ in significant ways regarding the concept. 

Intonation studies, on the other hand, do not compare subjects based on social 
groupings. Instead, these studies seem to assume that all speakers of a language will react 
in much the same way to particular intonations because they are part of the same larger 
speech community. Since there is no concept being evaluated, the kind of grouping that is 
done in attitude studies in the social sciences is not really possible with intonation studies. 
Before such groupings can be done, it is necessary to have some theoretically motivated 
criterion for choosing groups that are likely to differ in responses to intonation. At present, 
no such criterion exists. Very few studies have examined the social patterning of intonation, 
and most of those have focused on regional variations (e.g. British vs. American English). 
In a beginning step toward a description of social patterns of intonation, McLemore (1991) 
strongly argued for the social patterning of intonational patterns in social groups at the 
micro-level. In a way, it is not surprising that there is very little research that examines the 
attitudes of diverse social groups to intonation. Without a resolution of the more 
fundamental issue, viz., evaluation of a social concept, the measurement of attitudes is really 
not possible in intonation studies. 

The differences between attitude studies in the two traditions are shown in Table 1. 



DIFFICULTIES WITH INTONATIONAL STUDIES 

Intonational research on attitude has never yielded very good results, primarily 
because attitude is not a useful concept to describe intonational meaning. The studies 
themselves display the serious weaknesses inherent in attempting to define meaning with 
specific affective terms. Ladd (1980) points out that two studies conducted in the 1960s 
show conflicting evidence about the ease with which subjects can identify specific affective 
descriptors. Lieberman and Michaels (1962) argue that boredom is one of the most 
consistently identified labels, while Osser (cited in Ladd, 1980) found that it is one of the most 
consistently confused labels. The reason for the discrepancy, Ladd says, is that Lieberman 
and Michaels chose labels that were not easily confused while Osser included calmness and 
sadness along with boredom, both of which were confused with boredom. As a 
consequence, later studies have tried to choose labels that are maximally distinct from one 
another. This was the criterion for the rating labels chosen in the Scherer et al. (1984) and 
the Ladd et al. (1985) studies. 

Crystal (1969), in an instructive study on the value of specific labels, conducted a 
production experiment to discover the labels that had well-defined contrastivity for different 
speakers. He chose twenty labels and had six subjects, three men and three women, read 
sentences in such a way as best represented each label. For all subjects, 

some labels had a well-defined contrastivity and were highly meaningful to 
the informants (for example, ‘angry’, ‘impatient’, ‘boring’, ‘questioning’, 
‘pleased’); others had very little clear formal definition, and were useless as 
indicators of attitude.... Substantial disagreements over all but a few labels 
emerged from a relatively small number of informants, which suggests the 
existence of deep divisions of usage in these respects (p. 307). 

After analysis of the phonetic detail for each of the labels, Crystal chose 20 
utterances which he felt were particularly good performances of each label, and constructed 
a matching task similar to the one used by Lieberman and Michaels (1962). The same six 
subjects who read the sentences were asked to match the label to the attitude being 
expressed. The best score was 60 percent correct identification, and no two subjects made 
the same set of identifications and misidentifications. Several weeks later, Crystal asked the 
same subjects to label the attitudes expressed by the sentences, this time without labels being 
provided to them. The best identification percentage fell to 20 percent, and there was a 
strong tendency to use two to three labels to describe the attitude. In addition, the six 
subjects used nearly 100 labels in contrast to the 20 original labels. These results bear out 
Crystal’s major point about labels, that "the way in which individuals use such labels is not 
constant: a label’s meaning may vary from person to person, and even within one person 
from time to time" (p. 295). 



Table 1 

Differences between two attitude research traditions 

Traditional Attitude Studies                 Intonational Studies 

1. Seek judgments about a social             1. Evaluate intonation apart from any 
   concept.                                     other concept. 

2. Use a series of statements or             2. Use decontextualized sentences 
   propositions by which the concept            which are meant to be colorless and 
   can be evaluated.                            disconnected from any concept. 

3. Measure the positive or negative          3. Use measurement instruments which 
   reaction to a concept but avoid              use specific labels rather than 
   labeling the evaluation with specific        positive or negative responses to a 
   descriptors.                                 concept. 

4. Measure attitude on an ordinal or         4. Measure attitude on a nominal scale. 
   interval scale. 

5. Compare distinct groups and assume        5. Assume that all speakers will have 
   that a sample of subjects differ in          similar judgments of the attitudes 
   their attitudes toward a concept based       conveyed by intonation. Expect 
   on social differences.                       uniformity of response rather than 
                                                patterned distinctions based on social 
                                                factors. 


If this kind of inconsistency occurs even for a small number of subjects who are asked 
to label sentences that they themselves have read and have identified previously, there is very 
little reason to assume that what is needed are more carefully defined descriptors. Instead, 
the nature of affective meaning must be at a more general level than that assumed by the 
descriptors. While it is true that people consistently make judgments that someone sounds 
happy or pompous or sad or anything else, it is simply counter-productive to assume that 
intonational meaning can best be understood in terms of these specific labels. Judgments of 
happiness or sadness are made through a complex interpretive scheme involving the entire 
communication context, and though intonation is a part of this context, in itself it cannot 
narrow the judgment to a particular label or set of labels. 

While there are good operational reasons for choosing maximally distinct labels 
(especially the fact that the results are unlikely to show any significant identifications when 
the labels have similarities of some kind), the very need to keep the labels so distinct argues 
that the specific labels are inadequate and that what subjects are identifying is a more general 
affective category. The labels are only a symbol of something more abstract, and are chosen 
because they come closest to the general affective reaction. 

Not all writers on intonation have been enthralled with attitude as a description of 
intonational meaning. While most studies have demonstrated that intonation conveys 
affective meaning of some sort, the directness of the link between intonation and particular 
"attitudes" is not clear. McLemore (1991, p. 136), for example, states that individual labels 
can characterize very different pitch patterns. Thus anger, a heavily studied affective state, 
has been associated with both high and low pitch. Such variation shows that there are 
mediating influences between the intonation and the local interpretation. Attitude research 
has sought to specify the final interpretation without an adequate understanding of the 
inferential processes involved in interpreting particular tunes (i.e. intonation contours) and 
without an understanding of the role of context in affecting particular interpretations. 
McLemore (1991, pp. 134-135) hints at the extent of the difficulties inherent in attempting 
to specify intonational meaning in terms of speaker attitude. 



...the interpretation of an intonational form in terms of attitude involves imputing 
certain interactive and communicative goals to a speaker. This is accomplished 
through a complex interaction between the patterned uses of a given form in the 
culture generally, situations specifically, and discourses (i.e. the indexical value of 
a form in the culture); evaluation of spier’s [sic] intentions in terms of a socially 
recognized role, including the relative status with which that role endows the 
speaker; and aspects of the immediate discourse context, including topic, and their 
interaction with expectations for communicative behaviors in the specific situation 
in the culture. 

Pierrehumbert & Hirschberg (1990, p. 284) also criticize speaker attitude as insufficient 
to characterize intonational meaning. 

Though speaker attitude may sometimes be inferred from choice of a particular 
tune, the many-to-one mapping between attitudes and tune suggests that attitude is 
better understood as derived from tune meaning interpreted in context than as 
representing that meaning itself.... Neither speech acts nor propositional attitudes -- at 
least as standardly understood -- appear to provide sufficient characterizations for 
available tunes in English. 

The inability to specify particular attitudes for particular tunes in English shows that attitude, 
as it is conceived of in intonational research, is too simplistic to be useful. 



CONCLUSION 

Ladd et al. (1985) suggest that most of the difficulties in intonation and attitude research 
could be solved with a more precise taxonomy of attitude and emotion. However, the 
contention of this paper is that the way that attitude has been characterized in intonational 
research is not a profitable way to describe the meaning of intonation. The specific labels 
used to describe attitudes are a way to describe the inference made from the speech signal 
in a particular context, but it is not possible to consistently identify the labels, nor to 
correlate particular pitch patterns with particular labels. Intonational studies of attitude have 
sought a precision that is not likely to be possible no matter what kind of a taxonomy of 
attitude and emotion is developed, since the desire to precisely label attitudinal meaning with 
particular labels is fundamentally flawed and resembles mainstream attitude research in name 
only. 

This raises a question as to whether it is possible for intonation researchers simply to 
use a different term for "attitude" and have a viable research agenda. If there were no 
connection to mainstream attitude research at all, would there then be validity in the same 
line of research? In other words, is the problem only one of terminology, or is there 
something more fundamentally wrong with the approach taken in intonation research? I 
suggest that there is a fundamental flaw in the approach that invalidates the approach whether 
intonation research uses the word attitude or not. 

Clearly, the speech signal carries messages that can be described as "affective" (Ladd 
et al., 1985). These messages can be conveyed by intonation contours, the placement of the 
nucleus, the pitch range, voice quality, amplitude, speech tempo, gesture and many other 
suprasegmental and paralinguistic elements (Crystal, 1969; Bolinger, 1986, 1989). 
However, previous research has shown both the impossibility of associating specific labels 
with any of these elements alone (Crystal, 1969) and the tendency of listeners to not 
consistently distinguish labels that are not maximally distinct (Ladd, 1980), thus suggesting 
that different listeners cannot agree on specific labels. It has also been shown that 
mainstream attitude research long ago began using specific labels only to show general 
orientations to a topic. It is reasonable to assume, then, that the failure to find adequate 
labels to describe intonational meaning, whether the labels are said to describe attitude, 
emotion or any other term that may be used instead, stems from a flaw in the approach, 
especially because the desired specificity goes against mainstream attitude research. It also 
goes against the focus of intonational research, in which basic meanings are maximally 
abstract rather than specific (Pierrehumbert & Hirschberg, 1990; McLemore, 1991). More 
specific meanings can likely only be calculated with reference to a more fully developed 
theory of pragmatics, since fundamental frequency and other suprasegmental elements 
contribute to the meaning of an utterance only in conjunction with other elements of the 
utterance (Ladd, 1980; Pierrehumbert & Hirschberg, 1990).⁶ Even granting a more fully 
developed theory of pragmatics that will allow calculations of specific meanings, it is 
unlikely that the specificity hoped for by some researchers is possible or desirable. It is hard 
to imagine, for example, a calculation that would allow a clear distinction to be made 
between words like "calm," "bored," and "matter-of-fact." Language does not allow such 
specificity. Instead, there are always misunderstandings and different interpretations, with 
listeners hearing what speakers never intended, both parties creating and adjusting meaning 
through the interaction. 

One possible use for the research on intonation and affect is in distinguishing categorical 
differences in contours. This was the approach taken by Hirschberg and Ward (1992) in 
distinguishing between two interpretations consistently given to the rise-fall-rise intonation 
(L*+H L H% in Pierrehumbert’s terms). The labels "incredulity" and "uncertainty" were 
used to investigate the contribution of different features to differing interpretations of the two 
contours. The labels were used not to argue that certain features meant "incredulity" or 
"uncertainty" but to argue for abstract meaning differences associated with certain features, 
specifically differences in speaker involvement. While it is well-known that listeners, even 
trained listeners, do not consistently hear phonetic differences in intonation, they do seem 
able to draw distinctions between messages conveyed by changes in intonation 
(Pierrehumbert, 1980). Thus, this knowledge can be used to help support distinctions in 
intonation (see also Pierrehumbert & Steele, 1989). However, using specific labels as a 
description of meaning will only lead to wasted energy. Intonational meaning has never 
been described well with such an approach, not because we lack a clear enough taxonomy 
of attitudes and emotions, but because such labels can only suggest speaker affect. 
Intonational meaning may eventually be described with such labels, but only if we 
understand that the labels can at best remain general descriptions that are the final output of 
many linguistic and paralinguistic elements. 



NOTES 

¹Pike explains the basic meaning this way: "The word or syllable which contains the 
beginning point of the primary contour is singled out as the CENTER OF SELECTIVE 
ATTENTION of the speaker, or constitutes the demand by the speaker that the hearer focus 
his attention at that point" (1945, p. 44). This meaning is similar to that posited for the nucleus 
in other accounts. 

²LaPiere tried to remove himself as a potential factor in the treatment accorded the 
Chinese couple. He tried to send them in to the establishments first to ask for 
accommodations while he took care of the luggage or car so that the owners’ responses would be 
more clearly based on their race. 

³Intonational studies can be divided into two basic types: production studies and 
perception studies. In production studies, subjects are asked to read sentences so as to 
express particular attitudes or emotions, that is, to read the sentences excitedly, confidently, 
angrily, etc. Lieberman and Michaels (1962), for example, asked subjects to read eight 
sentences in eight different ways (e.g. pompous, confidential, angry). The sentences were 
then analyzed acoustically to find common prosodic characteristics for each label. Sentences 
that were particularly good examples of each label were then used for a perception 
experiment. It is common for production experiments, where speakers try to express 
particular emotions or attitudes, to feed into perception experiments. However, production 
experiments can and have stood alone, as in many intonation and emotion experiments (see 
Scherer, 1979). 

⁴Intonation is ignored as a variable in mainstream attitude research, in which the 
stimuli are written and not oral. 

⁵This study is instructive because of the mixing of grammatical and affective 
categories (e.g. statement - happiness). This mixing calls into question whether subjects 
performed the matching task well because they heard only the affective categories or because 
they matched stimuli to grammatical categories as well. 

⁶Pike (1945) also recognized that the attitudes conveyed by intonation were first and 
foremost abstract meanings rather than specific. Thus while he argued strongly for 
intonational meaning being attitudinal, he did not pretend that the meaning could be best described 
with specific labels. 